ABSTRACT

This thesis describes a VHSIC Hardware Description Language (VHDL) simulation of
a hardware 8 x 8 Discrete Cosine Transform (DCT) which can be applied to image
compression. A Top-Down Design approach is taken in the study. A discussion of DCT theory
is presented, along with a description of the 1-D DCT circuit architecture and its simulation in
VHDL. Results of the 2-D DCT simulation are included for two simple test patterns and verified
by hand calculation, demonstrating the validity of the simulation. Shortcomings found in the
simulation are described, together with suggestions for correcting them. In the future, the VHDL
description of the 8 x 8 image block 2-D DCT can be further developed into structural and
gate-level descriptions, after which hardware circuit implementation can occur.

I.    INTRODUCTION

A.  LITERATURE  BACKGROUND 

This thesis builds on the paper "An 8 x 8 Discrete Cosine
Transform Chip with Pixel Rate Clock" by D'Luna, L. J. [Ref. 1]. The original paper
introduced the algorithm and implementation of the one-dimensional (1-D) as well as the
two-dimensional (2-D) Discrete Cosine Transform (DCT), using the principle of distributed
arithmetic. A hardware circuit architecture was implemented according to the algorithm
introduced there.

Another very important aspect discussed in this thesis is the implementation of a
"Top-Down Design" concept that uses the Very High Speed Integrated Circuit (VHSIC)
Hardware Description Language [Ref. 4-8] as a tool. "Top-Down Design" is a kind of
design that first describes the given algorithm in a high-level language. After the
algorithm is described, the structural architecture is described next. Finally, this structural
description is developed into a hardware circuit. VHDL facilitates the algorithm
description, the structural description, and the hardware circuit simulation.

B.  OBJECTIVE 

The purpose of this thesis is to describe the behavior of the implemented
architecture of the algorithm mentioned above in VHSIC Hardware Description
Language (VHDL). The description was simulated on a workstation in order to analyze its
characteristics. In the process of describing the behavior of this structural architecture,
complicated hardware circuits are developed as behavioral models. This is usually the first
step in a "Top-Down Design" task. The objective is to use a DCT implementation as an
example to study the "Top-Down Design" methodology.

C.  RATIONALE  FOR  USING  VHDL  TO  DESCRIBE  THE  CIRCUIT 

In the past, VHSIC design was dominated by bottom-up design methodologies,
in which hardware circuit details were established and produced before the system was
constructed [Ref. 4]. This methodology is very useful in dealing with small circuits.
However, when the system gets complicated, a bottom-up design methodology is more
difficult to handle. In this work, a high-level, top-down design approach is taken.
Initially, a description of the algorithm is written. Later on, a detailed architecture is
described. Both are done in VHDL. VHDL is a hierarchical hardware description language
which supports mixed-level simulation. This thesis shows the beginning steps of a
"Top-Down Design" approach. The 8 x 8 image block DCT algorithm was implemented as
a behavioral model and a structural model. VHDL was used here to accomplish the initial
design of the 1-D Discrete Cosine Transform implementation.

D.  OVERVIEW  OF  THE  THESIS 

There are six chapters in this thesis. The first chapter is an introduction to the
literature background, the objective, and the reasons for using VHDL. Chapter II
introduces the algorithm of the Discrete Cosine Transform and the principle of distributed
arithmetic. Chapter III examines the components of the structural architecture. Chapter
IV gives the actual VHDL behavioral description of the components, their circuit
block diagrams, and their connections. Chapter V analyzes the simulation results and
discusses the design problems encountered. The last chapter is the conclusion.


II.    BASIC DISCRETE COSINE TRANSFORM THEORY

A.      DISCRETE  COSINE  TRANSFORM  IN  IMAGE  COMPRESSION 

1.       Rationale  for  using  Discrete  Cosine  Transform 

Image transmission or storage usually deals with a large amount of digital
data. A monochrome picture typically contains 512 x 512 pixels. If one pixel needs
8 bits to represent its information, transmitting a monochrome picture means that more
than two megabits (512 x 512 x 8 = 2,097,152) of digital data need to be transmitted.
There are many coding methods that compress large amounts of data to reduce the
transmission bandwidth and the amount of storage space required. Among these methods,
transform domain compression is an effective way to eliminate the redundant information
in images, since image data are usually highly correlated.

Image transformation is used to extract a small number of significant
coefficient values from the original image by mapping the image data onto a
two-dimensional spectrum. Each coefficient in the transform domain represents some amount
of energy of the spectral component. The original spatial image can then be recovered
from these coefficients, since each image has its own specific spectral pattern. After
the transformation, only a few coded values are required to describe the original
image. Consequently, it is possible to save bits during transmission and storage.

The Fourier transform algorithm has been applied to image processing for a
long time, since it possesses many desirable analytic properties. However, it has two major
drawbacks. First, the computation of the Fourier transform involves complex numbers
rather than real numbers. Second, the spectral energy decreases only slowly as frequency
increases. This slow decrease in the spectrum is a very significant
disadvantage in image coding.

The Discrete Cosine Transform (DCT) has the advantage of involving only
real number computations. It is well suited for image data compression. Consequently,
two-dimensional cosine transforms of 8 x 8 image blocks have been adopted in a
draft international standard (JPEG) [Ref. 1]. This thesis concentrates on studying the
Discrete Cosine Transform and building a circuit for 8 x 8 image blocks.

2.       Formulae  of  the  Discrete  Cosine  Transform 

The general formula of a one-dimensional Discrete Cosine Transform (1-D
DCT) is expressed as

\[ Z_k = \sum_{i=0}^{N-1} x_i \, c_{ik} \qquad (1) \]

where $Z_k$ is the transform of $x_i$, $c_{ik}$ is the forward transformation kernel, and i and k
range from 0 to N - 1. The inverse transform of the 1-D DCT is given by the relation
where $h_{ik}$ is the inverse transformation kernel. The characteristics of the transform are
determined by the properties of its transformation kernel.


The  1-D  DCT  forward  kernel  is  given  by 
Substituting Eq. (3) and (4) into Eq. (1) yields

where $Z_k$, k = 0, 1, 2, ..., N - 1, is the 1-D DCT of $x_i$.

The inverse kernel is of the same form as Eq. (3) and (4), so that the inverse
DCT is expressed by the equation

where i = 0, 1, 2, ..., N - 1.

The  two-dimensional  forward  DCT  kernel  is  given  as 
where i, j = 0, 1, ..., N - 1, and k, l = 1, 2, ..., N - 1. The inverse kernel is also of
this form. Thus, the two-dimensional DCT pair is expressed by

where k, l = 1, 2, ..., N - 1, and

where i, j = 0, 1, ..., N - 1.

It  can  be  seen  that  DCT  transformation  kernels  are  separable  from  Eqs.  (3), 
(4),  (8),  and  (9).  Therefore,  the  two-dimensional  forward  or  inverse  transformation  can 
be  computed  by  applying  two  one-dimensional  DCT  operations  successively. 


B.      ALGORITHM  FOR  8  BY  8  IMAGE  DISCRETE  COSINE  TRANSFORM 

1.       Methodology  of  2-D  DCT 

Let $x_{ij}$ denote an image pixel value, which is an n-bit number. The indices
i and j represent the row and column location of the pixel, respectively. The N x N
two-dimensional DCT can be expressed by

\[ Z_{kl} = \sum_{j=0}^{N-1} \left[ \sum_{i=0}^{N-1} x_{ij} \, c_{ik} \right] c_{jl} \qquad (14) \]

$Z_{kl}$ is the spectral coefficient corresponding to the kth horizontal frequency and lth vertical
frequency. In matrix notation, the inner summation is equivalent to a 1-D DCT
computation on the columns of X. The outer summation is equivalent to a 1-D DCT
computation on the rows of the inner summation results. C can be used to represent the
DCT matrix. Its columns are the 1-D DCT basis vectors, whose elements are $c_{mk}$ (the 1-D DCT
kernels), where


\[ c_{m0} = \frac{1}{\sqrt{N}}, \qquad m = 0, 1, 2, \ldots, N-1 \qquad (15) \]

\[ c_{mk} = \sqrt{\frac{2}{N}} \cos\frac{(2m+1)k\pi}{2N} \qquad (16) \]

m = 0, 1, 2, ..., N-1; k = 1, 2, ..., N-1. Because the kernels of the DCT
transformation can be separated, the 2-D matrix Z of 2-D DCT coefficients can be
represented as

\[ Z = [X^t C]^t C = C^t X C. \qquad (17) \]


This process can be realized in the architecture shown in Fig. 1 (see [Ref. 1]).

Fig. 1    2-D DCT Block Diagram

The N x N block of image X is input column by column first, and the 1-D DCT
computation is done. This computation is carried out as shown in the square brackets of
Eq. (17) for the jth column (for j = 0, 1, ..., N - 1). The resulting N x N matrix
is then transposed for the second, row-by-row 1-D DCT computation. This transpose is
done as described by the term outside the square brackets in Eq. (17). After the
transposition, the same 1-D DCT computation involving the same transform matrix C is
carried out again. The transpose step takes care of the column-to-row change
of the data. The key operations involved here are the matrix transpose and the 1-D DCT
computation.
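
The column pass, transpose, and second pass described above can be checked numerically. The following Python sketch is only an illustration and is not part of the VHDL model; the helper names (dct_matrix, dct2_two_pass, dct2_direct) and the test block are assumptions introduced here. It builds C from Eqs. (15) and (16), evaluates Z = [X^t C]^t C, and compares the result with a direct double summation over both indices.

```python
import math

def dct_matrix(n):
    """n x n matrix C with entries c[m][k] taken from Eqs. (15) and (16)."""
    c = [[0.0] * n for _ in range(n)]
    for m in range(n):
        c[m][0] = 1.0 / math.sqrt(n)
        for k in range(1, n):
            c[m][k] = math.sqrt(2.0 / n) * math.cos((2 * m + 1) * k * math.pi / (2 * n))
    return c

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dct2_two_pass(x):
    """Z = [X^t C]^t C: one 1-D pass, a transpose, then the same 1-D pass again."""
    c = dct_matrix(len(x))
    first = matmul(transpose(x), c)      # X^t C: 1-D DCT of every column of X
    return matmul(transpose(first), c)   # transpose, then reuse the same matrix C

def dct2_direct(x):
    """Direct double sum: Z[k][l] = sum_i sum_j c[i][k] * x[i][j] * c[j][l]."""
    n = len(x)
    c = dct_matrix(n)
    return [[sum(c[i][k] * x[i][j] * c[j][l] for i in range(n) for j in range(n))
             for l in range(n)] for k in range(n)]

# Arbitrary 8 x 8 test block (hypothetical data, not one of the thesis test patterns).
x = [[(3 * i + 5 * j) % 17 for j in range(8)] for i in range(8)]
z_two_pass = dct2_two_pass(x)
z_direct = dct2_direct(x)
max_diff = max(abs(z_two_pass[k][l] - z_direct[k][l]) for k in range(8) for l in range(8))
print("two-pass result matches direct sum:", max_diff < 1e-9)
```

Because both passes reuse the same matrix C, only one 1-D DCT computation and a transpose need to be provided, which is the point the text makes about the key operations.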

2.       Principle  of  distributed  arithmetic 

The implementation of the 1-D DCT studied here is based on the principle
of distributed arithmetic. Using this principle, it is possible to implement the "bit
calculation" in the chip design. "Bit multiplication" is simply carried out by using the
input data bit pattern to address a Read Only Memory and by summing up all the results
to obtain the "transposed spectral values". If $Y_i$ ($Y_i = \{y_{im}\}_{m=0}^{N-1}$) is the image pixel value
represented by a row vector, then its 1-D DCT is

\[ Z_{ik} = \sum_{m=0}^{N-1} y_{im} \, c_{mk}, \qquad k = 0, 1, \ldots, N-1. \qquad (18) \]

Now the input data $y_{im}$ can be represented in 2's complement notation with
p bits as

\[ y_{im} = -y_{im}^{(p-1)} \, 2^{p-1} + \sum_{q=0}^{p-2} y_{im}^{(q)} \, 2^{q} \qquad (19) \]

where $y_{im}^{(q)}$ is the qth bit of the incoming image pixel value $y_{im}$, which has a value of
either 0 or 1, and $2^q$ is the binary weight of the qth bit. For example, if the input data is a 2's
complement 8-bit pattern, then
$y_{im} = -y_{im}^{(7)} \times 2^7 + y_{im}^{(0)} \times 2^0 + y_{im}^{(1)} \times 2^1 + y_{im}^{(2)} \times 2^2 + y_{im}^{(3)} \times 2^3 + y_{im}^{(4)} \times 2^4 + y_{im}^{(5)} \times 2^5 + y_{im}^{(6)} \times 2^6$.
Substituting Eq. (19) into Eq. (18) gives

\[ Z_{ik} = -\sum_{m=0}^{N-1} y_{im}^{(p-1)} c_{mk} \, 2^{p-1} + \sum_{q=0}^{p-2} \sum_{m=0}^{N-1} y_{im}^{(q)} c_{mk} \, 2^{q} \qquad (20) \]

\[ Z_{ik} = -F_k(C_k, Y_i^{(p-1)}) \, 2^{p-1} + \sum_{q=0}^{p-2} F_k(C_k, Y_i^{(q)}) \, 2^{q} \qquad (21) \]

where $F_k$ is a function of the vectors $C_k$ and $Y_i^{(q)}$ and is represented as

\[ F_k(C_k, Y_i^{(q)}) = \sum_{m=0}^{N-1} c_{mk} \, y_{im}^{(q)} \qquad \text{for } q = 0, 1, 2, \ldots, p-1. \qquad (22) \]

Its expanded form can be shown as

\[ F_k(C_k, Y_i^{(q)}) = c_{0k} \, y_{i0}^{(q)} + c_{1k} \, y_{i1}^{(q)} + \cdots + c_{N-1,k} \, y_{i,N-1}^{(q)} \qquad (23) \]

where q = 0, 1, ..., p-1.
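
The 2's complement expansion in Eq. (19) is easy to confirm with a few lines of Python. This is only a sketch: the function names twos_complement_bits and from_bits are introduced here for illustration, and the word length p = 8 follows the 8-bit example in the text.

```python
def twos_complement_bits(value, p):
    """Bits [y(0), ..., y(p-1)] of a signed integer in p-bit 2's complement, LSB first."""
    if not -(1 << (p - 1)) <= value < (1 << (p - 1)):
        raise ValueError("value does not fit in p bits")
    pattern = value & ((1 << p) - 1)   # wrap a negative value into its p-bit pattern
    return [(pattern >> q) & 1 for q in range(p)]

def from_bits(bits):
    """Eq. (19): y = -y(p-1) * 2^(p-1) + sum over q = 0..p-2 of y(q) * 2^q."""
    p = len(bits)
    return -bits[p - 1] * (1 << (p - 1)) + sum(bits[q] << q for q in range(p - 1))

# Round trip for a few 8-bit values, including both range limits.
for y in (-128, -37, -1, 0, 1, 90, 127):
    assert from_bits(twos_complement_bits(y, 8)) == y

print(twos_complement_bits(-37, 8))
```

Only the most significant bit carries a negative weight, which is why the last term of the accumulation must be subtracted rather than added.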

3.       Methodology  for  forming  the  ROM  storage 

In Eq. (23), $c_{mk}$ are the elements of the 1-D DCT basis (kernel) vectors used as
multiplication coefficients. They are converted from decimal numbers to the 2's complement
notation used in this thesis. $y_{im}^{(q)}$ are the bits, in 2's complement form, of the
N data points $y_{im}$. Because the basis vectors are fixed-value coefficients and $F_k$ are
functions of the basis vectors and the binary bit patterns, the values of $F_k$ (with a fixed
k) for all possible N-bit patterns ($y_{im}^{(q)}$, m = 0, 1, 2, ..., N - 1) can be calculated and
stored in Read Only Memory (ROM) according to Eq. (22) and Eq. (23). The N-bit
pattern changes with time according to the incoming data $y_{im}^{(q)}$ (m = 0, 1, 2, ..., N-1).
This bit pattern forms an address to access the ROM and extract the corresponding
$F_k(C_k, Y_i^{(q)})$ value.

From Eq. (20) and Eq. (21), the corresponding 1-D DCT spectral coefficient
$Z_{ik}$ can be computed by shifting and adding the $F_k$ values stored in the ROM. In Eq.
(21), $F_k$ is a function of the corresponding basis column vector $C_k$ for k = 0, 1, 2, ...,
N-1, so $F_k$ is different for each value of k. The incoming data vector $Y_i$ is the same
for the multiplication coefficients involved for all values of k. It is therefore possible to build
N separate memory banks of multiplication coefficients and compute the N 1-D DCT
spectral coefficients $Z_{ik}$ (k = 0, 1, 2, ..., N-1) in parallel or concurrently.
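
The ROM look-up and shift-add scheme of Eqs. (21) and (22) can be sketched in Python as follows. This is a floating-point illustration under stated assumptions, not the thesis model: the names build_rom, ROMS, and dct_distributed are hypothetical, the word length P = 8 is chosen only for the example, and the ROM contents are kept as floats instead of 2's complement words.

```python
import math

N = 8   # data points per 1-D DCT
P = 8   # bits per data point in 2's complement (assumed for this sketch)

def dct_coeff(m, k, n=N):
    """c_mk from Eqs. (15) and (16)."""
    if k == 0:
        return 1.0 / math.sqrt(n)
    return math.sqrt(2.0 / n) * math.cos((2 * m + 1) * k * math.pi / (2 * n))

def build_rom(k):
    """ROM for coefficient k: the word at each address is F_k of Eq. (22),
    the sum of c_mk over every m whose address bit is 1."""
    return [sum(dct_coeff(m, k) for m in range(N) if (addr >> m) & 1)
            for addr in range(1 << N)]

ROMS = [build_rom(k) for k in range(N)]   # N banks, one per spectral coefficient

def dct_distributed(y):
    """1-D DCT of N signed P-bit integers by ROM look-up and shift-add (Eq. 21)."""
    z = [0.0] * N
    for q in range(P):
        # Bit q of every data point forms the N-bit ROM address.
        addr = sum(((y[m] >> q) & 1) << m for m in range(N))
        weight = -(1 << q) if q == P - 1 else (1 << q)   # the sign-bit term is subtracted
        for k in range(N):                               # all banks see the same address
            z[k] += weight * ROMS[k][addr]
    return z

def dct_direct(y):
    """Eq. (18) evaluated with ordinary multiplications, for comparison."""
    return [sum(y[m] * dct_coeff(m, k) for m in range(N)) for k in range(N)]

y = [12, -7, 100, -128, 0, 55, -1, 127]
error = max(abs(a - b) for a, b in zip(dct_distributed(y), dct_direct(y)))
print("ROM words per bank:", len(ROMS[0]), " max error:", error)
```

No multiplier appears in dct_distributed: every product of Eq. (18) is replaced by a table look-up, a shift (the binary weight), and an addition.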

4.       Exploiting  the  symmetry  in  DCT  to  save  storage  in  ROM 

Here, 8 x 8 image blocks are used, so N = 8. Each ROM address is formed from one bit of
each of the 8 incoming data points, so the address has 8 bits. This means $2^8 = 256$ possible
bit patterns will be formed into addresses. There would be 256 corresponding multiplication
coefficient sums stored in the ROM for each of the 8 DCT spectral coefficients. However,
advantage can be taken of the symmetry in the DCT basis vectors. It can be shown that

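\[ c_{mk} = c_{N-1-m,k} \qquad \text{for } k = 0, 2, \ldots, N-2 \quad (k \text{ even}). \qquad (24) \]

For example,

\[ c_{02} = \sqrt{\frac{2}{8}} \cos\frac{2\pi}{16} = \sqrt{\frac{2}{8}} \cos\frac{30\pi}{16} = c_{72} \qquad (25) \]

where $c_{mk}$ is defined by Eq. (15) and Eq. (16). The following can also be proven:

\[ c_{mk} = -c_{N-1-m,k} \qquad \text{for } k = 1, 3, \ldots, N-1 \quad (k \text{ odd}). \qquad (26) \]

For example,

\[ c_{01} = \sqrt{\frac{2}{8}} \cos\frac{\pi}{16} = -\sqrt{\frac{2}{8}} \cos\frac{15\pi}{16} = -c_{71} \qquad (27) \]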
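Hence, Eq. (18) can be reduced to

\[ Z_{ik} = \sum_{m=0}^{N/2-1} (y_{im} + y_{i,N-1-m}) \, c_{mk} \qquad \text{where } k = 0, 2, \ldots, N-2 \quad (k \text{ even}) \qquad (28) \]

and,

\[ Z_{ik} = \sum_{m=0}^{N/2-1} (y_{im} - y_{i,N-1-m}) \, c_{mk} \qquad \text{where } k = 1, 3, \ldots, N-1 \quad (k \text{ odd}). \qquad (29) \]

Equations (22) and (23) then can be reduced to

\[ F_k(C_k, Y_i^{(q)}) = \sum_{m=0}^{N/2-1} c_{mk} \, (y_{im} + y_{i,N-1-m})^{(q)} \qquad \text{where } k = 0, 2, 4, \ldots, N-2 \qquad (30) \]

\[ F_k(C_k, Y_i^{(q)}) = \sum_{m=0}^{N/2-1} c_{mk} \, (y_{im} - y_{i,N-1-m})^{(q)} \qquad \text{where } k = 1, 3, 5, \ldots, N-1. \qquad (31) \]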

From the above equations, it is possible to add or subtract the incoming data
points before memory access and reduce the number of data values that address the ROM from
N to N/2. The total number of bit patterns is now only $2^{N/2} = 2^4 = 16$. Only a 16-word
ROM is necessary for each of the 8 DCT coefficients, and therefore a total of 16 x 8
= 128 ROM words is required. This saving of ROM storage is significant compared to
the cost of using adders and subtracters in a different architecture. Since only
one particular bit pattern (those bits which have the same binary weight) is allowed to
address the ROM at a time, and the bit pattern changes according to the serially incoming
data, the addition and subtraction can be done in a bit-serial fashion. This advantage is
exploited in the chip implementation discussed in the next chapter.
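
The storage reduction can be illustrated with a short Python sketch. It is an illustration only: the names build_half_rom, HALF_ROMS, and dct_folded are hypothetical, the sums and differences are formed with ordinary integer arithmetic rather than bit-serially, and the word length of the folded data (9 bits for 8-bit inputs) is an assumption of the example.

```python
import math

N = 8  # block length used in the text

def dct_coeff(m, k, n=N):
    """c_mk from Eqs. (15) and (16)."""
    if k == 0:
        return 1.0 / math.sqrt(n)
    return math.sqrt(2.0 / n) * math.cos((2 * m + 1) * k * math.pi / (2 * n))

def build_half_rom(k):
    """16-word ROM: address bit m selects c_mk for m = 0 .. N/2-1 (Eqs. 30 and 31)."""
    return [sum(dct_coeff(m, k) for m in range(N // 2) if (addr >> m) & 1)
            for addr in range(1 << (N // 2))]

HALF_ROMS = [build_half_rom(k) for k in range(N)]

def dct_folded(y, p):
    """1-D DCT from pre-added (k even) or pre-subtracted (k odd) data.
    p is the 2's complement word length of the folded sums and differences."""
    sums = [y[m] + y[N - 1 - m] for m in range(N // 2)]
    diffs = [y[m] - y[N - 1 - m] for m in range(N // 2)]
    z = []
    for k in range(N):
        data = sums if k % 2 == 0 else diffs
        acc = 0.0
        for q in range(p):
            addr = sum(((data[m] >> q) & 1) << m for m in range(N // 2))
            weight = -(1 << q) if q == p - 1 else (1 << q)   # sign bit is subtracted
            acc += weight * HALF_ROMS[k][addr]
        z.append(acc)
    return z

y = [12, -7, 100, -128, 0, 55, -1, 127]          # 8-bit inputs, so sums need 9 bits
direct = [sum(y[m] * dct_coeff(m, k) for m in range(N)) for k in range(N)]
error = max(abs(a - b) for a, b in zip(dct_folded(y, 9), direct))
total_words = sum(len(rom) for rom in HALF_ROMS)
print("total ROM words:", total_words, " max error:", error)
```

The folded version reaches the same coefficients with 128 ROM words in total instead of 8 x 256, at the price of four adders and four subtracters in front of the ROM.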




III.    A STRUCTURAL ARCHITECTURE FOR THE 1-D DCT


A.   8 x 8 IMAGE BLOCK 1-D DCT CIRCUIT ARCHITECTURE

The 1-D DCT architecture studied previously is shown in Fig. 2 [Ref. 1].

Fig. 2   Architecture of 1-D DCT

There are 8 slices parallel to each other, corresponding to the 8 DCT coefficients, which are
computed concurrently. First, 12-bit pixels AI(11:0) are put column by column into the
"serial-in-parallel-out" shift register (A). This sequence needs 8 clock cycles to complete.
After the 8th clock, the shift registers output the data into the "parallel load 2-bit serial
shift register" (B) at once. This is completed at the 9th clock cycle. At the same time, the
serial-in-parallel-out shift registers also get their new incoming data. The data stored in
the B shift register has to be added or subtracted according to Eqs. (30) and (31) in order
to reduce the ROM storage. To make Eqs. (30) and (31) easier to follow,
they are expanded below.

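\[ F_k(C_k, Y_i^{(q)}) = \sum_{m=0}^{N/2-1} c_{mk} \, (y_{im} + y_{i,N-1-m})^{(q)} \]

\[
\begin{array}{ccccccccl}
  & m=0 & & m=1 & & m=2 & & m=3 & \\
= & c_{00}(y_{i0}+y_{i7})^{(q)} & + & c_{10}(y_{i1}+y_{i6})^{(q)} & + & c_{20}(y_{i2}+y_{i5})^{(q)} & + & c_{30}(y_{i3}+y_{i4})^{(q)} & \quad k = 0 \\
= & c_{02}(y_{i0}+y_{i7})^{(q)} & + & c_{12}(y_{i1}+y_{i6})^{(q)} & + & c_{22}(y_{i2}+y_{i5})^{(q)} & + & c_{32}(y_{i3}+y_{i4})^{(q)} & \quad k = 2 \\
= & c_{04}(y_{i0}+y_{i7})^{(q)} & + & c_{14}(y_{i1}+y_{i6})^{(q)} & + & c_{24}(y_{i2}+y_{i5})^{(q)} & + & c_{34}(y_{i3}+y_{i4})^{(q)} & \quad k = 4 \\
= & c_{06}(y_{i0}+y_{i7})^{(q)} & + & c_{16}(y_{i1}+y_{i6})^{(q)} & + & c_{26}(y_{i2}+y_{i5})^{(q)} & + & c_{36}(y_{i3}+y_{i4})^{(q)} & \quad k = 6
\end{array}
\]

\[ F_k(C_k, Y_i^{(q)}) = \sum_{m=0}^{N/2-1} c_{mk} \, (y_{im} - y_{i,N-1-m})^{(q)} \]

\[
\begin{array}{ccccccccl}
  & m=0 & & m=1 & & m=2 & & m=3 & \\
= & c_{01}(y_{i0}-y_{i7})^{(q)} & + & c_{11}(y_{i1}-y_{i6})^{(q)} & + & c_{21}(y_{i2}-y_{i5})^{(q)} & + & c_{31}(y_{i3}-y_{i4})^{(q)} & \quad k = 1 \\
= & c_{03}(y_{i0}-y_{i7})^{(q)} & + & c_{13}(y_{i1}-y_{i6})^{(q)} & + & c_{23}(y_{i2}-y_{i5})^{(q)} & + & c_{33}(y_{i3}-y_{i4})^{(q)} & \quad k = 3 \\
= & c_{05}(y_{i0}-y_{i7})^{(q)} & + & c_{15}(y_{i1}-y_{i6})^{(q)} & + & c_{25}(y_{i2}-y_{i5})^{(q)} & + & c_{35}(y_{i3}-y_{i4})^{(q)} & \quad k = 5 \\
= & c_{07}(y_{i0}-y_{i7})^{(q)} & + & c_{17}(y_{i1}-y_{i6})^{(q)} & + & c_{27}(y_{i2}-y_{i5})^{(q)} & + & c_{37}(y_{i3}-y_{i4})^{(q)} & \quad k = 7
\end{array}
\]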

The numbers above the expanded equation represent the index m, and the numbers on
the right side are the index k. $c_{mk}$ are the multiplication coefficients. The bit
addition or subtraction is determined according to whether k is an even or odd number.

Registers B must be emptied in less than 8 clock cycles in order to receive new
data coming from registers A. Each datum is 12 bits in length. If a single bit at a time is
shifted out of registers B, it will take 12 clock cycles to empty the register. This will cause
a collision during the addition and subtraction of the data. There are two ways to solve this
problem: either clock register B twice as fast, or shift out the data 2 bits at a time (which
empties the register in 6 clock cycles). The latter alternative has been chosen for reasons of
design convenience and simpler system considerations. The shifted 2-bit data is added or
subtracted in the "2-bit adder/subtractor" C. Its output is stored in the shift registers D,
which split the least significant bit and most significant bit (binary weight q = 0 and
q = 1) into two output lines.

Next comes the question of where the output data of the adders and subtracters
should go to address the ROM, and how the values in the ROM should be arranged. The
expanded equations above show that all the adder outputs, designated U0(0:3) and
U1(0:3) (refer to Fig. 2), should be multiplied by the coefficients $c_{mk}$, where
k = 0, 2, 4, 6. These outputs are the 4-bit patterns formed by the sums of the paired
bits $y_{im}^{(q)}$; q = 0 represents the LSB and q = 1 represents the MSB of the 2-bit data in
Eqs. (20) and (21). All the difference outputs V0(3:0) and V1(3:0) should
be multiplied by the coefficients $c_{mk}$, where k = 1, 3, 5, 7. As a result, the output bit
patterns of the four adders and subtractors form a 4-bit address to access the corresponding
accumulated sum of the coefficients $c_{mk}$, k = 0, 1, ..., 7, which are stored in ROM E.
This step accomplishes the 1-D DCT coefficient multiplication. The output of the
ROM is first latched in register F, and then adder/subtractor G calculates the sum of
the "2-bit" spectral coefficient values according to Eq. (21). The LSB (q = 0) values are
shifted to the right one position and added to the q = 1 values. This addition
continues until the last bit pattern (the 12th) of the incoming column data. According to Eq.
(19), the incoming data are represented in 2's complement notation, so the most
significant bit's value should be subtracted from all the previous summations. This is
done by switching the add/sub control line of G to subtraction at the clock cycle of the
last bit pattern for each column of data.

The 2-bit sum or difference results of G are stored in register H and then sent to
the accumulator I and J. The accumulator consists of one "16-bit adder" and a "shift
right 2-bit register". The value stored in ROM E is a 16-bit word. The 16-bit adder I
adds the previous 2-bit right-shifted value (output of J) to the incoming value (output of
H). The resulting value is then output to register J to do the 2-bit right shift. This process
accomplishes the computation of Eq. (21) as index q varies from 0 to p-1 in 2-bit
increments. One thing has to be noted with caution: the initial value in the shift right
2-bit registers for every incoming column of data should be zero. Otherwise, the previous
column values would accumulate. To avoid this, the shift right 2-bit register is cleared
at the beginning of the accumulation of every column group.
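
The 2-bit-per-clock accumulation can be mimicked in a few lines of Python. This sketch makes several simplifying assumptions that are not in the thesis: it uses floating-point values instead of 16-bit words (so truncation in the right shifts is ignored), it does not use the folded ROM of Eqs. (30) and (31), and the helper rom_output stands in for the word read from ROM E. The names are hypothetical.

```python
import math

N = 8    # data points per column
P = 12   # bits per datum, as in the architecture described above

def dct_coeff(m, k, n=N):
    """c_mk from Eqs. (15) and (16)."""
    if k == 0:
        return 1.0 / math.sqrt(n)
    return math.sqrt(2.0 / n) * math.cos((2 * m + 1) * k * math.pi / (2 * n))

def rom_output(y, k, q):
    """Stand-in for the ROM word F_k selected by bit q of every datum (Eq. 22)."""
    return sum(dct_coeff(m, k) * ((y[m] >> q) & 1) for m in range(N))

def accumulate_two_bits_per_clock(y, k):
    acc = 0.0                                   # register J cleared for each new column
    for step in range(P // 2):                  # 6 clocks for 12-bit data
        low = rom_output(y, k, 2 * step)        # q = 0 line of the current 2-bit group
        high = rom_output(y, k, 2 * step + 1)   # q = 1 line of the current 2-bit group
        if step == P // 2 - 1:
            g = low / 2 - high                  # last group: the sign-bit term is subtracted
        else:
            g = low / 2 + high                  # LSB value shifted right once, then added
        acc = acc / 4 + g                       # adder I with the 2-bit right shift of J
    return acc * 2 ** (P - 1)                   # undo the scaling caused by the right shifts

y = [2047, -2048, 5, -300, 1000, 0, -1, 77]     # arbitrary 12-bit test column
direct = [sum(y[m] * dct_coeff(m, k) for m in range(N)) for k in range(N)]
serial = [accumulate_two_bits_per_clock(y, k) for k in range(N)]
error = max(abs(a - b) for a, b in zip(serial, direct))
print("max error of the shift-accumulate result:", error)
```

If acc were not reset to zero at the start of each column, the first right shift would carry part of the previous column's result into the new sum, which is the hazard the text warns about.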

After 8 clock cycles, the accumulated values are parallel loaded into register K.
Similar to register A but in the reverse direction, register K puts out the 1-D DCT
spectral coefficients column by column. These 1-D DCT coefficients are then transposed
by the transpose RAM (TRAM) according to Eq. (17). The transpose RAM is described
in the next section. After the transpose RAM, the 1-D DCT coefficients are input
again into the same 1-D DCT architecture. The only difference now is that the registers A
and B have to be expanded from 12 bits to 16 bits for the second transform.

B.   TRANSPOSE  RAM  ARCHITECTURE 

According to Eq. (17), the purpose of the "transpose RAM" is to change the columns of
the 8 x 8 block of 1-D DCT coefficients into rows, and the rows into columns. The
coefficient values are generated by the 1-D DCT architecture column by column.
These values are written into a RAM while the transposed values are read out.
Therefore, the transpose RAM must be capable of taking in the 1-D DCT
values and putting out the transposed values in the same cycle. How can this be done?

The coefficient values come out of the 1-D DCT architecture in serial order: the
0, 1, 2, ..., 7 coefficients of the first column of the 8 x 8 block come in first, and then
the 0, 1, ..., 7 coefficients of the second column, the third column, and so on. This
order is a long stream of coefficients 0, 1, ..., 63 for each 8 x 8 image block. After
storing them in the RAM, the coefficients must be read out in groups of 8 values in the
order of 0, 8, 16, ..., 56; 1, 9, 17, ..., 57; 2, 10, 18, ..., 58; 3, 11, 19, ..., 59; 4, 12,
20, ..., 60; 5, 13, 21, ..., 61; 6, 14, 22, ..., 62; 7, 15, 23, ..., 63 to achieve the transpose
operation. In the same cycle, just after a transposed value of the first block is read out,
the next coefficient value of the second block can be written into that location. This is
like reading block 1_0 (first 8 x 8 block, position 0) and writing block 2_0 (second 8 x
8 block, position 0), reading block 1_8 and writing block 2_1, reading block 1_16 and
writing block 2_2, and so on. In order to achieve the transpose of the second block, the
sequence for reading out block 2 must be in the order of 0, 1, 2, ..., 63. When reading
out the coefficients of block 2, the third block coefficients are written into the same
locations just after each read. The order is like reading block 2_0 and writing block
3_0, reading block 2_1 and writing block 3_1, reading block 2_2 and writing block 3_2,
and so on. Notice that the address order is 0, 1, 2, ..., 63 first, then 0, 8, 16, ..., and
then again the sequential order 0, 1, 2, ..., 63, and so on.
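
The alternating address sequence can be simulated with a short Python sketch. It is an illustration only: the function names transposed_order and run_transpose_ram are hypothetical, the RAM is modeled as a plain list of 64 words, and one extra pass with empty input is added at the end simply to flush the last block.

```python
def transposed_order(n=8):
    """Address sequence 0, 8, 16, ..., 56, 1, 9, ... for an n x n block."""
    return [n * (t % n) + t // n for t in range(n * n)]

def run_transpose_ram(blocks, n=8):
    """Stream blocks through one n*n-word RAM: at every address the old value is
    read first, and the next value of the incoming block is then written there."""
    size = n * n
    sequential = list(range(size))
    strided = transposed_order(n)
    ram = [None] * size
    outputs = []
    for index in range(len(blocks) + 1):               # extra pass flushes the last block
        order = sequential if index % 2 == 0 else strided
        incoming = blocks[index] if index < len(blocks) else [None] * size
        out = []
        for t, addr in enumerate(order):
            out.append(ram[addr])                      # read the previous block's value
            ram[addr] = incoming[t]                    # overwrite with the new block's value
        if index > 0:
            outputs.append(out)                        # the first pass only fills the RAM
    return outputs

# Three hypothetical blocks; element t of block b is 100*b + t.
blocks = [[100 * b + t for t in range(64)] for b in range(3)]
outputs = run_transpose_ram(blocks)
order = transposed_order()
print(outputs[0][:9])
print(all(outputs[b] == [blocks[b][a] for a in order] for b in range(3)))
```

Every block leaves the RAM in transposed order even though the address generator only alternates between the sequential and the stride-8 sequence, so a single 64-word RAM is sufficient.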

As shown before, the structural architecture design is based on the principle of
distributed arithmetic, and it is data-path oriented. The methodology used to describe this
architecture in VHDL and to simulate it on a computer is discussed in the next chapter.

IV.  VHDL BEHAVIORAL DESCRIPTION OF THE 1-D DCT COMPONENT


A.      BLOCK  DIAGRAM  DESCRIPTION 

Fig. 3   1-D DCT block diagram

The block diagram of the 1-D DCT shown in Fig. 3 can be described in models
using VHDL. The block diagram shown here includes the 1-D DCT system discussed in
Chapter III and the additional clock generators, delay lines, control line, package 1, and
test bench. There are minor differences between this diagram and the architecture
described in the previous chapter. One consideration when simulating this
system in VHDL is that a signal flow latency will occur. Therefore, a delay line is
necessary to change the clock triggering time and solve this latency problem.
Additionally, the architecture in the previous chapter does not make it clear when to
control the add/sub register G to complete the summation of 2's complement
values. It is shown here that the control line generating this control bit is triggered by
the delayed clock.

From the modeling point of view, it is rather complicated to build a 16-bit adder
in VHDL following the usual arithmetic logic. The easiest approach is to convert the
16-bit binary coefficient values into integers and then do the addition or
subtraction on integers. After the integer addition or subtraction, the integers are simply
converted back to binary values. This conversion task is accomplished by procedures in
package 1. A VHDL package is a collection of functions and procedures. Of course,
some overflow/underflow situations are expected to occur during these conversions. One
last thing to note in Figure 3 is that the test bench module controls all the signal flow,
the input data, and the output data, and it also simulates the whole design.

B.      BI-TO-DI  AND  DI-TO-BI  VHDL  PACKAGE 

The package pack1 in VHDL is shown below.

package pack1 is                          -- Package declaration
    procedure bi_to_in                    -- Procedure 1 changes 16-bit binary into integer
        (variable x : in  bit_vector(15 downto 0);
         variable y : out integer);
    procedure in_to_bi                    -- Procedure 2 changes integer into binary
        (variable m : in  integer;
         variable n : out bit_vector(15 downto 0));
end pack1;

package body pack1 is                     -- Package body declaration

    procedure bi_to_in                    -- First procedure that changes bits to integer
        (variable x : in  bit_vector(15 downto 0);
         variable y : out integer) is
        variable sum : integer := 0;
        variable p   : bit_vector(15 downto 0);
    begin
        p := x;
        if p(15) = '1' then               -- Change negative value to positive
            for i in 0 to 14 loop
                if p(i) = '1' then        -- Lowest '1' found: invert every bit above it
                    for j in i to 13 loop
                        p(j+1) := not p(j+1);
                    end loop;
                    exit;
                end if;
            end loop;
            for k in 0 to 14 loop         -- Integer conversion
                if p(k) = '1' then
                    sum := sum + 2**k;
                end if;
            end loop;
            y := -sum;                    -- Convert back to negative value
        else
            for l in 0 to 14 loop         -- Positive value conversion
                if p(l) = '1' then
                    sum := sum + 2**l;
                end if;
            end loop;
            y := sum;
        end if;
    end bi_to_in;                         -- End of procedure 1

    procedure in_to_bi                    -- Second procedure that changes integer to bits
        (variable m : in  integer;
         variable n : out bit_vector(15 downto 0)) is
        variable temp_a : integer := 0;
        variable temp_b : integer := 0;
        variable w      : bit_vector(15 downto 0);
    begin
        if m < 0 then
            temp_a := -m;                 -- Take the absolute value of negative values
        else
            temp_a := m;
        end if;
        for i in 14 downto 0 loop         -- Binary conversion of the magnitude
            temp_b := temp_a / (2**i);
            temp_a := temp_a rem (2**i);
            if (temp_b = 1) then
                w(i) := '1';
            else
                w(i) := '0';
            end if;
        end loop;
        if m > 0 then
            w(15) := '0';                 -- Assign positive sign bit
        else
            w(15) := '1';                 -- Assign negative sign bit
            for k in 0 to 14 loop
                if w(k) = '1' then        -- Invert the bits above the lowest '1' (2's complement)
                    for j in k to 13 loop
                        w(j+1) := not w(j+1);
                    end loop;
                    exit;
                end if;
            end loop;
        end if;
        if w(14) = '0' and w(13) = '0' and w(12) = '0' and w(11) = '0'
           and w(10) = '0' and w(9) = '0' and w(8) = '0' and w(7) = '0'
           and w(6) = '0' and w(5) = '0' and w(4) = '0' and w(3) = '0'
           and w(2) = '0' and w(1) = '0' and w(0) = '0'
        then
            w(15) := '0';                 -- Avoid negative zero
        end if;
        n := w;
    end in_to_bi;                         -- End of procedure 2
end pack1;                                -- End of package body

The VHDL package used in the simulation is basically similar to a subroutine in any
other high-level language that implements specific shared operations. The difference here
is that several different procedures or functions can be gathered together in one
package. Here, pack1 consists of two procedures, bi_to_in and in_to_bi. The bi_to_in
procedure converts 16-bit binary numbers (represented in 2's complement notation) into
positive or negative integers. The in_to_bi procedure converts positive or negative
integers back to 2's complement 16-bit binary numbers. Note that the 2's complement number
system used here has only 16 bits, including one sign bit. In overflow situations,
the digits that overflow are truncated.
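
The arithmetic behind the two procedures can be summarized in a few lines of Python. This is only a sketch of the number format, not the thesis code: the function names mirror bi_to_in and in_to_bi, but representing a bit vector as a Python integer and masking with 0xFFFF to model truncation on overflow are assumptions made for illustration.

```python
WIDTH = 16                      # word length used throughout the simulation
MASK = (1 << WIDTH) - 1         # 0xFFFF, keeps only the low 16 bits


def bi_to_in(word: int) -> int:
    """Interpret a 16-bit pattern (0..0xFFFF) as a 2's complement integer."""
    word &= MASK
    # When the sign bit is set, the represented value is word - 2**16.
    if word & (1 << (WIDTH - 1)):
        return word - (1 << WIDTH)
    return word


def in_to_bi(value: int) -> int:
    """Encode an integer as a 16-bit 2's complement pattern.

    Bits that do not fit into 16 bits are dropped, which models the
    truncation on overflow described in the text.
    """
    return value & MASK


if __name__ == "__main__":
    for v in (0, 1, -1, 1448, -132):
        print(v, format(in_to_bi(v), "016b"), bi_to_in(in_to_bi(v)))
```

A round trip through both functions returns the original integer for every value from -32768 to 32767.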

C.      CLOCK GENERATOR MODULE (CLOCK_GE)

The block diagram of "clock_ge" is shown in Figure 4. The interface connection (port map
in VHDL) is also shown; it indicates how the circuit is connected to the test bench.

Fig. 4   clock_ge block diagram (labels CLCK and CK)

The VHDL source code of clk.vhd is shown below.

entity clock_ge is -- Entity declaration
  port(CLCK : inout bit);
end clock_ge;

architecture clk_ctl of clock_ge is -- Architecture declaration
begin
  process(CLCK) -- Process declaration
    variable I : integer := 0;
  begin -- Process begin
    CLCK <= not CLCK after 5 ns; -- Switching clock generation
    I := I + 1;
    assert I <= 80 -- Assertion terminates the infinite process
      report "job done"
      severity Error;
  end process; -- End of process
end clk_ctl; -- End of architecture

The sensitivity signal "CLCK" in the source code provides the clock for all the circuits.
The initial value of CLCK is "0", and its value changes to "1" after 5 ns. Since a process
in VHDL is basically an infinite loop, an "assert" instruction is used to terminate the
process. By incrementing the counter "I" each time the process is activated, the job
can be terminated appropriately after 80 iterations.
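
The waveform produced by this process can be sketched in Python as follows. The function name and the list-of-events representation are hypothetical, and the sketch assumes that the simulator halts once the assertion fails, which depends on how the simulator treats severity Error.

```python
def clock_waveform(half_period_ns: int = 5, max_toggles: int = 80):
    """Return (time_ns, level) pairs for a clock that starts at 0."""
    level = 0
    events = [(0, level)]                 # initial value of CLCK
    for i in range(1, max_toggles + 1):
        level ^= 1                        # CLCK <= not CLCK
        events.append((i * half_period_ns, level))
    return events


if __name__ == "__main__":
    for time_ns, level in clock_waveform()[:6]:
        print(f"{time_ns:4d} ns  CLCK = {level}")
```

With the default arguments the clock period is 10 ns and the last toggle occurs at 400 ns.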

D.      PARALLEL SHIFT REGISTER MODEL (LOAD)

Fig. 5   Serial load parallel shift register block diagram

Figure 5 shows the detailed block diagram of the parallel shift register (LOAD).
The source code in VHDL is shown below.

entity LOAD is
  port (AI : in bit_vector(15 downto 0);
        B0,B1,B2,B3,B4,B5,B6,B7 : out bit_vector(15 downto 0);
        CLK : in bit);
end LOAD;

architecture BEH of LOAD is
  type shift is array (0 to 7) of bit_vector(15 downto 0);
begin
  process
    variable A : shift;
    variable I, count : integer := 0;
  begin
    wait until CLK'event and CLK = '1'; -- Clock controls the timing
    for count in 0 to 7 loop
      wait until CLK'event and CLK = '1';
      for I in 0 to 6 loop -- Push input values down to correct position
        A(I) := A(I+1);
      end loop;
      A(7) := AI;
      if (count = 7) and (CLK'event and CLK = '1') then -- Output data
        B0 <= A(7);
        B1 <= A(6);
        B2 <= A(5);
        B3 <= A(4);
        B4 <= A(3);
        B5 <= A(2);
        B6 <= A(1);
        B7 <= A(0);
      end if;
    end loop;
    wait on AI,CLK; -- Process activated when sensitivity signal changes
  end process;
end BEH;

The 16-bit input data arrive at AI column by column, at a rate controlled by the test
bench. Note that the first word to appear is the 8th pixel value of the first column. In
other words, the incoming data arrive in the order 7, 6, 5, ..., 0. In this order, the
data are pushed down into the correct positions, and the 1-D DCT can be done correctly.
After the 1-D DCT computation in Figure 3, the corresponding spectral coefficients are put
back in the correct order, i.e., 0, 1, 2, ..., 7. The LOAD module outputs the data in
parallel to the second circuit, SHIFT, after eight clock cycles (count = 7). After that,
it processes the next column of data.
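
The reordering performed by LOAD can be checked with a short Python sketch. The function name and the use of a Python list for the register array A are assumptions for illustration; only the shift and output assignments follow the VHDL listing.

```python
def load_column(samples):
    """Shift eight words in serially and return the parallel outputs B0..B7."""
    if len(samples) != 8:
        raise ValueError("a column holds exactly eight words")
    a = [0] * 8                       # register array A(0)..A(7)
    for word in samples:              # one word per rising clock edge
        a = a[1:] + [word]            # A(I) := A(I+1); A(7) := AI
    # B0 <= A(7), B1 <= A(6), ..., B7 <= A(0)
    return a[::-1]


if __name__ == "__main__":
    # Pixels arrive in the order 7, 6, ..., 0 and leave as B0..B7 = 0, 1, ..., 7.
    print(load_column([7, 6, 5, 4, 3, 2, 1, 0]))
```

Because the last word shifted in ends up in A(7) and is assigned to B0, feeding pixel 7 first delivers pixel k on output Bk.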
E.      SHIFT-TWO-REGISTER MODEL (SHIFT)

Fig. 6   Shift two register block diagram

The block diagram for SHIFT is shown in Figure 6. It includes a second clock
generator and three delay gates. Because the incoming pixel values pass through the
parallel shift register (LOAD), which causes a delay of one clock cycle, it is necessary
to compensate for this latency by delaying the clock that triggers the shift-two-register
(SHIFT). A second clock, which runs twice as fast as ck, is used to trigger the delay line
through which the original clock passes. The VHDL source code of this faster clock
is similar to the previously discussed clock generator, except that it switches twice as
fast; the assertion limit for termination is therefore twice as large. The delay line
consists of shift registers. The VHDL source code of DELAY and of the shift register
is as follows.

entity delay is
  port(a : bit; b : out bit; CLK : bit); -- Normal clock coming in from port a
end delay;

architecture beh of delay is
begin
  process
    variable x : bit;
  begin
    wait until CLK'event and CLK = '1'; -- Faster clock controls timing
    x := a; -- Shifting the incoming clock
    b <= x;
    wait on CLK, a;
  end process;
end beh;


entity shift is
  port(bi0,bi1,bi2,bi3,bi4,bi5,bi6,bi7 : in bit_vector(15 downto 0);
       bo0,bo1,bo2,bo3,bo4,bo5,bo6,bo7 : out bit_vector(1 downto 0);
       CLK : in bit); -- Port declaration, eight input and output
end shift;

architecture beh of shift is
begin
  process
    variable I : integer := 0; -- counter as well as index
  begin
    for r in 0 to 7 loop
      wait until CLK'event and CLK = '1';
      bo0(0) <= bi0(I); -- "q" = 0 binary weight
      bo0(1) <= bi0(I+1); -- "q" = 1 binary weight
      bo1(0) <= bi1(I);
      bo1(1) <= bi1(I+1);
      bo2(0) <= bi2(I);
      bo2(1) <= bi2(I+1);
      bo3(0) <= bi3(I);
      bo3(1) <= bi3(I+1);
      bo4(0) <= bi4(I);
      bo4(1) <= bi4(I+1);
      bo5(0) <= bi5(I);
      bo5(1) <= bi5(I+1);
      bo6(0) <= bi6(I);
      bo6(1) <= bi6(I+1);
      bo7(0) <= bi7(I);
      bo7(1) <= bi7(I+1);
      I := I + 2; -- increment of two
    end loop;
    I := 0; -- reset the counter for next column of data
    wait on CLK,bi0,bi1,bi2,bi3,bi4,bi5,bi6,bi7; -- wait for new data
  end process;
end beh;


The data are input to the shift register in 16-bit words and output in 2-bit words.
Note that the counter "I" is used as a bit index into each data word. Therefore, a
reset (I := 0) is necessary after each column of words is done. Otherwise, the index
would run out of range, giving a run-time error in the VHDL simulation.
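
The slicing of one 16-bit word into eight 2-bit words can be sketched in Python as below. The function name and the tuple layout (q = 1 bit first, q = 0 bit second) are assumptions chosen for readability; the index stepping by two follows the listing.

```python
def two_bit_slices(word: int):
    """Split a 16-bit word into eight 2-bit slices, least significant first."""
    word &= 0xFFFF
    slices = []
    for i in range(0, 16, 2):             # I = 0, 2, ..., 14
        q0 = (word >> i) & 1              # bit I     (binary weight q = 0)
        q1 = (word >> (i + 1)) & 1        # bit I + 1 (binary weight q = 1)
        slices.append((q1, q0))
    return slices


if __name__ == "__main__":
    print(two_bit_slices(0x0B50))
```

Eight slices exhaust the word, which is why the index has to return to zero before the next column is processed.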
F.      2-BIT ADDER/SUBTRACTOR MODEL (ADDSUB)

The 2-bit adder/subtractor module is shown in Figure 7. The "adsu" VHDL source
code is shown in Appendix A. A simple flow chart in Figure 8 shows the behavior described
in VHDL. There are eight 2-bit words input into this circuit. It is necessary to do the
"serial" 2-bit addition or subtraction according to the expanded Eqs. (30) and (31).

Fig. 7   2-bit add/sub block diagram

Fig. 8   "adsu" flow chart

Since the incoming data are presented in 2's complement notation, 2's complement
addition or subtraction should be used. In addition, the 2-bit serial operation
must take into account carries generated previously. In other words, the first 2-bit
addition/subtraction might generate a carry, and this carry must be passed on to the
next 2-bit add/sub computation. The simplest way to solve this problem is to use a 2-bit
adder accompanied by a register holding the carry bit for the next addition/subtraction.
For the subtraction case, it is necessary to convert the subtrahend into 2's complement
notation and then use the same 2-bit adder to accomplish the computation. What has been
done here is to convert the subtrahend into 1's complement first and then add "1"
at the very first subtraction. The incoming subtrahend is just converted into 1's
complement notation, and the adder takes care of the "1" addition. In this way, the serial
subtraction is accomplished. There are four 2-bit adders and four 2-bit subtractors in the
source code. The "cr" bit sets the adder carry to zero at the beginning, and the "st" bit
sets the subtractor carry to "1". Later on, the adder/subtractor takes care of the carry
by itself. For convenience of notation, the two incoming 2-bit data words and the carry
bit are combined into a 5-bit word, and the addition is done in the 2-bit adder
block. How the 2-bit adder block is formed is explained further in
later discussion.
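
The serial scheme described above can be illustrated with a small Python model. It is a sketch of the arithmetic only, not the "adsu" code of Appendix A; the function name is hypothetical, and the initial carry values stand in for the "cr" and "st" bits.

```python
def serial_add_sub(x: int, y: int, subtract: bool = False) -> int:
    """Add or subtract two 16-bit words two bits at a time, LSB first."""
    carry = 1 if subtract else 0          # "st" presets 1, "cr" presets 0
    result = 0
    for j in range(8):                    # eight 2-bit slices per word
        a = (x >> (2 * j)) & 0b11
        b = (y >> (2 * j)) & 0b11
        if subtract:
            b ^= 0b11                     # 1's complement of the subtrahend slice
        total = a + b + carry             # at most 3 + 3 + 1 = 7
        result |= (total & 0b11) << (2 * j)
        carry = total >> 2                # carry register for the next slice
    return result                         # the final carry is discarded


if __name__ == "__main__":
    print(hex(serial_add_sub(5, 3)))          # 5 + 3
    print(hex(serial_add_sub(3, 5, True)))    # 3 - 5 in 2's complement
```

Presetting the carry to 1 on the first slice supplies the "+1" that turns the 1's complement of the subtrahend into its 2's complement.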
G.      SHIFT REGISTER MODEL (REG)

The shift register block diagram is shown in Figure 9. The signal is input at port a and
output at port b. The shift register model (REG) VHDL source code is shown below.

Fig. 9   Shift register (reg) block diagram


entity reg is
  port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(1 downto 0); -- input port
       b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(1 downto 0); -- output port
       CLK : bit);
end reg;

architecture beh of reg is
begin
  process
    variable d0,d1,d2,d3,d4,d5,d6,d7 : bit_vector(1 downto 0);
  begin
    d0 := a0; -- Substitute the input signal in a variable
    d1 := a1;
    d2 := a2;
    d3 := a3;
    d4 := a4;
    d5 := a5;
    d6 := a6;
    d7 := a7;
    wait until CLK'event and CLK = '1'; -- Clock control
    b0 <= d0; -- shift the variable to output signal
    b1 <= d1;
    b2 <= d2;
    b3 <= d3;
    b4 <= d4;
    b5 <= d5;
    b6 <= d6;
    b7 <= d7;
    wait on CLK;
  end process;
end beh;


This  circuit  is  the  simplest  one.  The  only  effect  of  this  code  is  to  use  a  signal 
assignment  statement  to  simulate  a  signal  buffer  causing  a  latency  period  of  one  clock 
cycle.  The  "wait  until  CLK'event  and  CLK  =  '1';"  statement  activates  the  timing 
control.  The  "wait  on  CLK"  statement  activates  the  process's  operation  whenever  the 
clock  changes  its  state. 

H.   READ  ONLY  MEMORY  MODEL  (ROM) 

Figure 10 shows the read only memory block diagram. The VHDL source code
is included in Appendix A. There are eight 2-bit words input to this block, and sixteen
16-bit words corresponding to the 1-D DCT multiplication coefficients are read out.
The outputs of the four adders, taken separately for the bits with binary weight q = 0
and q = 1, form two 4-bit address buses that access the corresponding ROM multiplication
coefficients. The same applies to the subtractors. There are sixteen individual ROM
locations with sixteen different values stored in them. Why there are sixteen ROM
locations, and why there are sixteen different values stored in them, is discussed in
detail in later sections. Note that in the address assignment part of the source code,
the order of the addresses starts from e0, e1, e2, e3 and ends with e7, e6, e5, e4. This
is also explained in later discussion. The values stored in each individual ROM have been
converted from the sums of coefficients to 16-bit 2's complement binary values.
The coefficient values are calculated according to Eq. (15) and Eq. (16).

Fig. 10   ROM block diagram

I.       SHIFT  RIGHT  1-BIT  REGISTER  MODEL  (SHI1) 

Figure 11 shows the shift right 1-bit register block diagram. Its VHDL source code
is included in Appendix A. The shift right 1-bit register receives sixteen 16-bit words,
performs a right shift on eight of them, and outputs the resulting sixteen 16-bit
words to the next circuit. The only difference between the input and the output values
is that the odd-numbered 16-bit words have been shifted right by one bit position. At the
same time, the original 16th bit (sign bit) of each odd word is checked and replaced by
the proper bit ("0" or "1", depending on whether the word has a positive or negative
value) to properly extend the binary 2's complement number.

Fig. 11   Shi1 register block diagram
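
A sign-extending right shift of this kind can be sketched in Python as follows. The function name and the parameter n are hypothetical; n = 1 corresponds to the operation described here, and the Appendix A code is not reproduced.

```python
def shift_right_signed(word: int, n: int = 1) -> int:
    """Shift a 16-bit 2's complement word right by n bits with sign extension."""
    word &= 0xFFFF
    sign = word >> 15                     # original bit 15 (sign bit)
    shifted = word >> n
    if sign:
        # Refill the vacated top n bits with 1s for negative values.
        shifted |= (0xFFFF << (16 - n)) & 0xFFFF
    return shifted


if __name__ == "__main__":
    print(hex(shift_right_signed(0x0B50)))    # positive word
    print(hex(shift_right_signed(0xFF7C)))    # negative word
```

Copying the sign bit into the vacated position keeps the shifted word equal to half of the original value, rounded toward minus infinity.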

J.       ADDER/SUBTRACTOR-G  MODEL  (ADDG) 

Figure 12 shows the add_g block diagram. It includes one control circuit and five
delay gates. The control circuit tells add_g whether to do addition or subtraction. The
purpose of the delay line is to compensate for signal latency, so that the add/subtract
controller is activated at the moment the signal arrives.

Fig. 12   Add_g block diagram

The add_g VHDL source code, as well as the control and delay VHDL source
code, is shown below.

entity control is
  port(CLK : bit; ct : out bit);
end control;

architecture beh of control is -- control
begin
  process
    variable i : integer := 0;
  begin
    wait until CLK'event and CLK = '1'; -- Clock triggers the circuit
    if i = 7 then
      ct <= '1'; -- output '1' every eight clock periods
    else
      ct <= '0';
    end if;
    i := i + 1;
    if i = 8 then
      i := 0; -- Reset the counter
    end if;
  end process;
end beh;

entity delay_10 is
  port(a : bit; b : out bit; CLK : bit);
end delay_10;

architecture beh of delay_10 is -- delay
begin
  process
    variable x : bit;
  begin
    wait until CLK'event and CLK = '1';
    x := a;
    b <= x;
    wait on CLK, a;
  end process;
end beh;

use work.pack1.all; -- All the procedures in pack1 are used
entity add_g is
  port(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16 :
         bit_vector(15 downto 0); -- input port
       b1,b2,b3,b4,b5,b6,b7,b8 : out bit_vector(15 downto 0); -- output port
       CLK,as : bit);
end add_g;

architecture beh of add_g is
begin
  process
    variable x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,
             n1,n2,n3,n4,n5,n6,n7,n8 : bit_vector(15 downto 0);
    variable y1,y2,y3,y4,y5,y6,y7,y8,y9,y10,y11,y12,y13,y14,y15,y16,
             m1,m2,m3,m4,m5,m6,m7,m8 : integer := 0;
  begin
    wait until CLK'event and CLK = '1';
    x1 := a1; x2 := a2; x3 := a3; x4 := a4; -- input values
    x5 := a5; x6 := a6; x7 := a7; x8 := a8;
    x9 := a9; x10 := a10; x11 := a11; x12 := a12;
    x13 := a13; x14 := a14; x15 := a15; x16 := a16;
    -- Procedure call to do integer conversion
    bi_to_in(x1,y1); bi_to_in(x2,y2); bi_to_in(x3,y3); bi_to_in(x4,y4);
    bi_to_in(x5,y5); bi_to_in(x6,y6); bi_to_in(x7,y7); bi_to_in(x8,y8);
    bi_to_in(x9,y9); bi_to_in(x10,y10); bi_to_in(x11,y11); bi_to_in(x12,y12);
    bi_to_in(x13,y13); bi_to_in(x14,y14); bi_to_in(x15,y15); bi_to_in(x16,y16);
    if as = '0' then
      m1 := y1 + y2; m2 := y3 + y4; m3 := y5 + y6; m4 := y7 + y8;
      m5 := y9 + y10; m6 := y11 + y12; m7 := y13 + y14; m8 := y15 + y16;
    else -- Control gives the subtraction instruction
      m1 := y1 - y2; m2 := y3 - y4; m3 := y5 - y6; m4 := y7 - y8;
      m5 := y9 - y10; m6 := y11 - y12; m7 := y13 - y14; m8 := y15 - y16;
    end if;
    -- Procedure call to do binary conversion
    in_to_bi(m1,n1); in_to_bi(m2,n2); in_to_bi(m3,n3); in_to_bi(m4,n4);
    in_to_bi(m5,n5); in_to_bi(m6,n6); in_to_bi(m7,n7); in_to_bi(m8,n8);
    b1 <= n1; b2 <= n2; b3 <= n3; b4 <= n4;
    b5 <= n5; b6 <= n6; b7 <= n7; b8 <= n8;
    wait on a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16,CLK;
  end process;
end beh;

The control circuit is triggered by the clock and generates the control bit "ct".
On every 8th clock period "ct" becomes "1"; otherwise it equals "0". The
delay is also triggered by the clock. It receives one bit and outputs the same bit one clock
cycle later.
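
The sequence produced on "ct" can be written out with a short Python sketch. The function name is hypothetical, and each list element stands for the value assigned to "ct" on one rising clock edge.

```python
def control_pulses(n_clocks: int):
    """Return the value assigned to "ct" on each rising clock edge."""
    i = 0
    out = []
    for _ in range(n_clocks):
        out.append(1 if i == 7 else 0)    # ct is '1' only when the counter is 7
        i = (i + 1) % 8                   # count 0..7, then reset to 0
    return out


if __name__ == "__main__":
    print(control_pulses(16))
```

Over sixteen clock edges the pulse appears twice, on the 8th and the 16th edge.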

Add_g has sixteen 16-bit word inputs and eight 16-bit word outputs. It performs
16-bit addition or subtraction. As discussed previously, it is rather complicated to build
a 16-bit adder/subtractor with a VHDL structural approach. The easiest way is to
convert the 16-bit binary words into integers. To do so, "use work.pack1.all" has to be
declared at the beginning of the entity in order to call the "bi_to_in" procedure in
pack1. "Work" represents the working library, and "pack1.all" makes all the
declarations in package pack1 available. After the conversion of binary values to integer
values, addition or subtraction is done according to the control input "as". The results
are then converted back to binary values for output. The timing is always synchronized by
the clock.


K.      SHIFT REGISTER-H MODEL (REG_H)

The reg_h block diagram is shown in Figure 13. It functions just like "reg", except that
"reg" handles 2-bit words and "reg_h" handles 16-bit words. The VHDL source codes are
the same except for the declared length of the bit vectors.

Fig. 13   Shift register_h (reg_h) block diagram

L.       16-BIT ADDER-I MODEL (ADD_I)

Figure 14 shows the block diagram of the 16-bit adder (ADD_I). ADD_I and
ADD_G are basically the same, but ADD_I has neither the "as" control bit nor the "if"
instruction in the VHDL source code to do the subtraction. Another big difference is
that ADD_I is not triggered by the clock. It adds up the two 16-bit inputs with no delay.
It also does integer addition with the procedures in pack1. The two inputs come from
REG_H and from the feedback output of SHI_2, which shifts the result to the right by
2 bits. This is shown in Figure 2. The VHDL source code for ADD_I is shown below.

use work.pack1.all;
entity add_i is
  port(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16 :
         bit_vector(15 downto 0);
       b1,b2,b3,b4,b5,b6,b7,b8 : out bit_vector(15 downto 0));
end add_i;

architecture beh of add_i is
begin
  process
    variable x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,
             n1,n2,n3,n4,n5,n6,n7,n8 : bit_vector(15 downto 0);
    variable y1,y2,y3,y4,y5,y6,y7,y8,y9,y10,y11,y12,y13,y14,y15,y16,
             m1,m2,m3,m4,m5,m6,m7,m8 : integer := 0;
  begin
    x1 := a1; x2 := a2; x3 := a3; x4 := a4;
    x5 := a5; x6 := a6; x7 := a7; x8 := a8;
    x9 := a9; x10 := a10; x11 := a11; x12 := a12;
    x13 := a13; x14 := a14; x15 := a15; x16 := a16;
    bi_to_in(x1,y1); bi_to_in(x2,y2); bi_to_in(x3,y3); bi_to_in(x4,y4);
    bi_to_in(x5,y5); bi_to_in(x6,y6); bi_to_in(x7,y7); bi_to_in(x8,y8);
    bi_to_in(x9,y9); bi_to_in(x10,y10); bi_to_in(x11,y11); bi_to_in(x12,y12);
    bi_to_in(x13,y13); bi_to_in(x14,y14); bi_to_in(x15,y15); bi_to_in(x16,y16);
    m1 := y1 + y2; m2 := y3 + y4; m3 := y5 + y6; m4 := y7 + y8;
    m5 := y9 + y10; m6 := y11 + y12; m7 := y13 + y14; m8 := y15 + y16;
    in_to_bi(m1,n1); in_to_bi(m2,n2); in_to_bi(m3,n3); in_to_bi(m4,n4);
    in_to_bi(m5,n5); in_to_bi(m6,n6); in_to_bi(m7,n7); in_to_bi(m8,n8);
    b1 <= n1; b2 <= n2; b3 <= n3; b4 <= n4;
    b5 <= n5; b6 <= n6; b7 <= n7; b8 <= n8;
    wait on a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16;
  end process;
end beh;

Fig. 14   16-bit add_i block diagram


M.     SHIFT RIGHT 2-BIT REGISTER MODEL (SHI_2)

Fig. 15   Shift right 2-bit register block diagram

The shift right 2-bit register (shi_2) block diagram is shown in Figure 15. It
includes another clock generator, running twice as fast, to trigger the delay unit that
delays the normal clock by one period. It also has a clear line (clr) from the test bench
that clears the register every eight clock cycles. The VHDL source code of SHI_2 is
shown in Appendix B.

The SHI_2 model has eight 16-bit word inputs from ADD_I and sixteen 16-bit
word outputs. The sign bit of each input value is checked, and SHI_2 shifts
the data 2 bits to the right in proper 2's complement representation. There are eight
blocks in the SHI_2 module. The results are updated and fed back to the ADD_I module to
perform an addition with the incoming data values. In every 8th clock cycle, the results
are shifted in parallel to the "parallel load serial shift" register (RESULT). During the
same cycle, the shift right 2-bit results are cleared, and SHI_2 is ready for the next
column operation.

N.      PARALLEL  LOAD  SERIAL  SHIFT  REGISTER  MODEL  (RESULT) 

The block diagram of the parallel load serial shift register (RESULT) is shown in
Figure 16. There are eight inputs from SHI_2; RESULT puts out only one value at a
time. The VHDL source code of RESULT is shown below.

entity result is
  port(a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
       k : out bit_vector(15 downto 0); CLK : bit);
end result;

architecture beh of result is
  type r is array (0 to 7) of bit_vector(15 downto 0);
begin
  process
    variable x : r;
  begin
    x(0) := a1; x(1) := a2; x(2) := a3; x(3) := a4;
    x(4) := a5; x(5) := a6; x(6) := a7; x(7) := a8;
    for i in 0 to 7 loop
      wait until CLK'event and CLK = '1';
      k <= x(i);
    end loop;
    wait on a1,a2,a3,a4,a5,a6,a7,a8,CLK;
  end process;
end beh;

Fig. 16   Parallel shift serial output register block diagram


Eight  16-bit  words  are  input  into  RESULT  every  8th  clock  cycle.  They  are  pushed 
out  one  value  at  a  time  at  every  clock  period.  After  all  eight  values  have  been  output, 
new  values  are  fed  in  again  for  the  next  cycle. 
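
The parallel-load, serial-output behavior can be modelled with a Python generator. The name serialize is hypothetical; each yielded value corresponds to one assignment to the output "k" on a rising clock edge.

```python
def serialize(words):
    """Yield eight parallel-loaded words one per clock, a1 first."""
    latched = list(words)                 # x(0..7) := a1..a8
    if len(latched) != 8:
        raise ValueError("RESULT latches exactly eight words")
    for i in range(8):
        yield latched[i]                  # k <= x(i) on each rising edge


if __name__ == "__main__":
    for value in serialize([0x0B50, 0, 0, 0, 0, 0, 0, 0]):
        print(format(value, "04X"))
```

Latching the inputs before the loop means that changes on a1..a8 during the eight output cycles do not disturb the words already being shifted out.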
O.      TEST BENCH

The test bench block diagram is shown in Figure 17. It includes all the intermediate
signals, the control signals, and the input and output signals. The VHDL source code
for the test bench is shown in Appendix B. All the components used in the system are
declared and instantiated there, and the signals used for the simulation are declared
as well. A configuration statement binds all the components to the test system. The input
pixel values are fed into the system through "di", and the system is simulated. The
results of the simulation are collected on signal "p". A table of the simulation results
"p" is generated and analyzed to see whether the design is functioning correctly.

Fig. 17   Block diagram of Test Bench (signals di, dr, set, cr and p between TEST BENCH and DESIGN CIRCUIT)




V.    SIMULATION  OUTPUT  ANALYSIS  AND  EXPERIENCE 

A.      FORMATION  OF  ROM  STORAGE  VALUES 

As discussed before, only a sixteen-word ROM is needed for each multiplication
coefficient because of the symmetry in the DCT. The coefficients can be calculated
according to Eq. (15) and Eq. (16).

Table I:   Multiplication Coefficients

                      m = 0           m = 1           m = 2           m = 3

Coefficients, k even  A = Yi0+Yi7     B = Yi1+Yi6     C = Yi2+Yi5     D = Yi3+Yi4
U0 (k = 0)            0.3535533905    0.3535533905    0.3535533905    0.3535533905
U2 (k = 2)            0.4619397662    0.1913417161   -0.1913417161   -0.4619397662
U4 (k = 4)            0.3535533905   -0.3535533905   -0.3535533905    0.3535533905
U6 (k = 6)            0.1913417161   -0.4619397662    0.4619397662   -0.1913417161

Coefficients, k odd   A = Yi0-Yi7     B = Yi1-Yi6     C = Yi2-Yi5     D = Yi3-Yi4
V1 (k = 1)            0.4903926402    0.4157348061    0.2777851165    0.0975451610
V3 (k = 3)            0.4157348061   -0.0975451610   -0.4903926402   -0.2777851165
V5 (k = 5)            0.2777851165   -0.4903926402    0.0975451610    0.4157348061
V7 (k = 7)            0.0975451610   -0.2777851165    0.4157348061   -0.4903926402

Since N = 8, the expanded forms of Eq. (30) and Eq. (31) can be written as in Table
I after substituting the proper indices (m, k). The labels U0, U2, ..., V7 are included in
the table for better understanding. Labels A, B, C, D stand for bit patterns. For example,
if A = 1, B = 0, C = 1, D = 1, then the values in columns 1, 3, and 4 are
summed to get the corresponding multiplication coefficient sum stored in the ROM.
The bit pattern in the circuit has two weighted groups (the LSB group, q = 0, and the MSB
group, q = 1). The coefficient values for these two groups are exactly the same.
Therefore, there are only 8 x 16 = 128 different coefficient sums stored in ROM.

One very important fact must be stressed: the values are not stored in the ROM as
decimal numbers but as binary numbers, so the summed decimal coefficients have to be
converted into binary. Upon inspection of Table I, it is noted that the largest possible
sum is not greater than 2 and the smallest possible sum is not less than -2. As stated
before, the number system used here is 16-bit 2's complement. Therefore, one sign bit,
one integer bit, and fourteen fraction bits are chosen to represent the binary numbers
stored in the ROM. All the decimal coefficients calculated according to the specific bit
pattern A, B, C, D have to be converted into binary 2's complement 16-bit numbers. This
conversion is carried out with the help of a small program written in Matlab, listed in
Appendix C. The actual values stored in the ROM are shown in the ROM VHDL source code.
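
The formation of one ROM can be sketched in Python as shown below. This is not the Matlab program of Appendix C. The function names, the choice of A as the most significant address bit, and the use of rounding to the nearest value (rather than truncation) are all assumptions made for illustration, so individual entries may differ by one least significant bit from the values in the ROM source code.

```python
FRACTION_BITS = 14            # 1 sign bit, 1 integer bit, 14 fraction bits


def to_fixed(value: float) -> int:
    """Encode a value in [-2, 2) as a 16-bit 2's complement pattern."""
    if not -2.0 <= value < 2.0:
        raise ValueError("value does not fit the chosen 16-bit format")
    # Rounding to nearest is an assumption; the thesis does not state it.
    return round(value * (1 << FRACTION_BITS)) & 0xFFFF


def rom_contents(row):
    """Return the 16 coefficient sums for one row of Table I.

    The address is the 4-bit pattern ABCD with A as the most significant
    bit (an assumed ordering); each set bit selects one column of the row.
    """
    rom = []
    for address in range(16):
        bits = [(address >> (3 - m)) & 1 for m in range(4)]   # A, B, C, D
        total = sum(c for c, b in zip(row, bits) if b)
        rom.append(to_fixed(total))
    return rom


U0 = [0.3535533905] * 4       # row k = 0 of Table I

if __name__ == "__main__":
    for address, word in enumerate(rom_contents(U0)):
        print(format(address, "04b"), format(word, "04X"))
```

For the example pattern A = 1, B = 0, C = 1, D = 1, the sketch sums columns 1, 3, and 4 of the selected row before converting the sum to the 16-bit format.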

B.        SIMULATION  AND  TESTING  IMAGE  PATTERN  (I) 

The first image pattern used is shown in Figure 18. It is a two-dimensional
cosine wave whose intensity varies along the x-axis. The pixel value is represented in 128
levels. Therefore, the pixel value of each point in this image is given by the
following formula

$$ f(x, y) = \left[ \cos(2\pi f_x x + 2\pi f_y y) + 1 \right] / 2 \times 128 \qquad (32) $$

where f_x = 1/4, f_y = 0.

Fig. 18   Pattern (I) 8 x 8 image block

After substituting the corresponding indices (x, y) of Figure 18 into Eq. (32), the pixel
values of this 8 x 8 image block are as shown in Table II. The 12-bit binary
representations of the decimal numbers 128 and 64 are "000010000000" and
"000001000000".

Table II: 8 x 8 image pixel values of Pattern (I)

         x = 0   x = 1   x = 2   x = 3   x = 4   x = 5   x = 6   x = 7
y = 7     128      64       0      64     128      64       0      64
y = 6     128      64       0      64     128      64       0      64
y = 5     128      64       0      64     128      64       0      64
y = 4     128      64       0      64     128      64       0      64
y = 3     128      64       0      64     128      64       0      64
y = 2     128      64       0      64     128      64       0      64
y = 1     128      64       0      64     128      64       0      64
y = 0     128      64       0      64     128      64       0      64

Converting the values in Table II into 12-bit binary numbers and feeding
them column by column into the 1-D DCT VHDL model yields the corresponding 1-D
DCT spectral coefficients (in hex) listed in Table III. The same decimal values in
Table II have also been put into a 1-D DCT subroutine from an image processing
library called spider. The result is shown in Table IV.
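
The pixel values of Eq. (32) and the floating-point 1-D DCT of one column can be reproduced with the Python sketch below. It is neither the VHDL model nor the spider subroutine: dct_1d is a direct-form 8-point DCT whose scaling, 0.5 c(k) cos((2m+1)k pi/16) with c(0) = 1/sqrt(2), is assumed from the coefficient values in Table I, and all function names are hypothetical.

```python
import math


def pixel(x: int, y: int, fx: float = 0.25, fy: float = 0.0) -> float:
    """Eq. (32): raised cosine scaled to the range 0..128."""
    return (math.cos(2 * math.pi * (fx * x + fy * y)) + 1) / 2 * 128


def dct_1d(column):
    """Direct 8-point DCT with the coefficient scaling assumed from Table I."""
    out = []
    for k in range(8):
        ck = 1 / math.sqrt(2) if k == 0 else 1.0
        acc = sum(v * math.cos((2 * m + 1) * k * math.pi / 16)
                  for m, v in enumerate(column))
        out.append(0.5 * ck * acc)
    return out


# Column x = 0 of Table II: eight samples along y, all equal to 128.
column0 = [round(pixel(0, y)) for y in range(8)]
coefficients = dct_1d(column0)

if __name__ == "__main__":
    print([round(pixel(x, 0)) for x in range(8)])
    print([round(c, 2) for c in coefficients])
```

Because every column of pattern (I) is constant along y, only the k = 0 coefficient of each column is nonzero, which is why a single row of the coefficient tables carries all the values.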

Due  to  the  time  limitations,  the  attempt  to  carry  out  the  transpose  of  the  1-D  DCT 
coefficients  in  VHDL  behavior  models  was  not  made.  However,  manual  transpose  is 
done  instead.  Transposed  1-D  DCT  coefficients  of  pattern  (I)  in  VHDL  simulation  is 
shown  in  Table  V.  The  values  in  Table  V  are  converted  again  into  binary  numbers  and 
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Table  III:  1-D  DCT  spectral  coefficients  of  Pattern  (I)  in  VHDL  simulation 


0B50 

05A8 

0000 

05A8 

0B50 

05A8 

0000 

05A8 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

0000 

Table IV: 1-D DCT coefficients of pattern (I) using Spider Subroutine

362.03   181.01        0   181.01   362.03   181.01        0   181.01
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0

input column by column into the 16-bit 1-D DCT VHDL model to accomplish the 2-D
DCT operation. The 2-D DCT spectral coefficients, which have been transposed back in
the VHDL simulation, are shown in Table VI. The 1-D DCT operations in the VHDL
simulation are based on integer calculation. In order to prove that the 1-D DCT VHDL
Table V: Transposed 1-D DCT coefficients of pattern (I) in VHDL simulation

0B50   0000   0000   0000   0000   0000   0000   0000
05A8   0000   0000   0000   0000   0000   0000   0000
0000   0000   0000   0000   0000   0000   0000   0000
05A8   0000   0000   0000   0000   0000   0000   0000
0B50   0000   0000   0000   0000   0000   0000   0000
05A8   0000   0000   0000   0000   0000   0000   0000
0000   0000   0000   0000   0000   0000   0000   0000
05A8   0000   0000   0000   0000   0000   0000   0000

Table VI: 2-D DCT spectral coefficients of pattern (I) in VHDL simulation

01F7   005F   0000   00C4   00FF   FF7C   0000   FFED
0000   0000   0000   0000   0000   0000   0000   0000
0000   0000   0000   0000   0000   0000   0000   0000
0000   0000   0000   0000   0000   0000   0000   0000
0000   0000   0000   0000   0000   0000   0000   0000
0000   0000   0000   0000   0000   0000   0000   0000
0000   0000   0000   0000   0000   0000   0000   0000
0000   0000   0000   0000   0000   0000   0000   0000

simulation result is correct, the values in Table V are converted into integers and are
shown in Table VII. The values in Table VII are again calculated column by column using
the Spider 1-D DCT subroutine. The resulting 2-D DCT spectral coefficients are transposed
and shown in Table VIII.
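The second pass described above (column-wise 1-D DCT of Table VII, followed by a transpose) can be sketched in a few lines of Python. This is only a floating-point cross-check under the assumption of orthonormal DCT-II scaling; it is not the Spider routine, and the names `dct_1d`, `table_vii` and `second_pass` are illustrative.

```python
import math

def dct_1d(x):
    """Orthonormal 1-D DCT-II (scaling assumed, see lead-in)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out

# Table VII: only the first column is non-zero.
first_col = [2896, 1448, 0, 1448, 2896, 1448, 0, 1448]
table_vii = [[v] + [0] * 7 for v in first_col]

# Feed the table column by column into the 1-D DCT.
columns = [list(c) for c in zip(*table_vii)]
second_pass = [dct_1d(c) for c in columns]

# Listing the result of each column as a row is the transposed layout,
# so second_pass[0] corresponds to the first row of Table VIII.
print([round(v, 1) for v in second_pass[0]])
```

The first row printed by this sketch agrees with the non-zero row of Table VIII to within rounding.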
Table VII: Table V in integer values

2896      0      0      0      0      0      0      0
1448      0      0      0      0      0      0      0
   0      0      0      0      0      0      0      0
1448      0      0      0      0      0      0      0
2896      0      0      0      0      0      0      0
1448      0      0      0      0      0      0      0
   0      0      0      0      0      0      0      0
1448      0      0      0      0      0      0      0

Table VIII: 2-D DCT spectral coefficients of pattern (I) using Spider Subroutine

4095.5   768.54        0   1573.0   2047.7  -1051.0        0   -152.8
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0

To ensure that the structural 1-D DCT calculation in the VHDL simulation is correct, a
direct 1-D DCT calculation on a calculator is also carried out, based on Eq. (15) and
Eq. (16). Equations (33) and (34) show the calculation for k = 0 and k = 1.
\[
C(0) = \frac{1}{\sqrt{8}}\,(2896 + 1448 + 0 + 1448 + 2896 + 1448 + 0 + 1448)
\tag{33}
\]

\[
C(1) = \sqrt{\frac{2}{8}}\,\Bigl(2896\cos\frac{\pi}{16} + 1448\cos\frac{3\pi}{16} + 0 + 1448\cos\frac{7\pi}{16}
     + 2896\cos\frac{9\pi}{16} + 1448\cos\frac{11\pi}{16} + 0 + 1448\cos\frac{15\pi}{16}\Bigr)
\tag{34}
\]


The results using this approach are listed in Table IX. Note that the results of Table VIII
and Table IX are very close.

Table IX: 2-D DCT coefficients of pattern (I) using direct calculation

4095.5   768.59        0   1537.0   2047.7  -1051.0        0   -152.8
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0
     0        0        0        0        0        0        0        0

It  is  also  necessary  to  trace  the  operation  in  the  VHDL  structural  models  shown  in 
Figure  2.  To  understand  the  structural  operation  and  calculation  of  the  1-D  DCT  in  the 
VHDL  simulation  in  more  detail,  a  manual  derivation  and  calculation  are  carried  out  for 
Table X: 16-bit binary number representation of Table V

Slice(0)   0000101101010000   0   0   0   0   0   0   0
Slice(1)   0000010110101000   0   0   0   0   0   0   0
Slice(2)   0000000000000000   0   0   0   0   0   0   0
Slice(3)   0000010110101000   0   0   0   0   0   0   0
Slice(4)   0000101101010000   0   0   0   0   0   0   0
Slice(5)   0000010110101000   0   0   0   0   0   0   0
Slice(6)   0000000000000000   0   0   0   0   0   0   0
Slice(7)   0000010110101000   0   0   0   0   0   0   0

this purpose. First, the values in Table V need to be converted into binary numbers, which
are shown in Table X. Clearly, only one column of Table X is non-zero. Therefore,
only one column of the 1-D DCT needs computation. The values in the first
column are input into the 1-D DCT VHDL model, which yields the serial 2-bit
addition/subtraction results shown in Table XI.
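The serial 2-bit slicing can be illustrated with a small Python sketch. It forms the four sums of the upper half (0+7, 1+6, 2+5, 3+4) from the first column of Table V and splits each 16-bit result into 2-bit groups, printed from clock 8 down to clock 1. The helper name `two_bit_slices` is illustrative, and the sketch only models the arithmetic, not the timing of the VHDL model.

```python
def two_bit_slices(value, width=16):
    """Split a two's-complement word into 2-bit groups, least significant first."""
    word = value & ((1 << width) - 1)
    return [format((word >> s) & 0b11, "02b") for s in range(0, width, 2)]

# First column of Table V in decimal (0B50 = 2896, 05A8 = 1448).
col = [2896, 1448, 0, 1448, 2896, 1448, 0, 1448]

for a, b in [(0, 7), (1, 6), (2, 5), (3, 4)]:
    total = col[a] + col[b]
    # Reverse so that the printed order runs from clock 8 to clock 1.
    bits = list(reversed(two_bit_slices(total)))
    print(f"Slice({a}+{b})", " ".join(bits))
```

The printed rows can be compared directly with the upper half (U) of Table XI.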

The first column of Table XI shows which slices are added or subtracted. The top row
gives the clock cycle. The rows in the upper half (U) correspond to even values of k,
and the rows in the lower half (V) correspond to odd values of k. In each clock cycle,
each half supplies four bits per bit position, forming a bus that addresses the corresponding
ROM coefficients. For example, at the first clock cycle there are two 4-bit buses. The
four least significant bits (LSBs) form an "ABCD" bus equal to "0000", which addresses
the "U00" ROM (refer to Fig. 2). This yields the value "0000000000000000" as
output. The MSBs of the first clock cycle address the "U01" ROM value
Table XI: Serial 2-bit addition/subtraction output

              clock-8  clock-7  clock-6  clock-5  clock-4  clock-3  clock-2  clock-1   Bit  Half
Slice (0+7)     00       01       00       00       11       11       10       00       A    U
Slice (1+6)     00       00       01       01       10       10       10       00       B    U
Slice (2+5)     00       00       01       01       10       10       10       00       C    U
Slice (3+4)     00       01       00       00       11       11       10       00       D    U
Slice (3-4)     11       11       01       10       01       01       10       00       D    V
Slice (2-5)     11       11       01       10       01       01       10       00       C    V
Slice (1-6)     00       00       10       01       10       10       10       00       B    V
Slice (0-7)     00       00       10       01       10       10       10       00       A    V

"0000000000000000"  out.  It  then  adds  up  with  the  1-bit  right  shifted  value  of  "U00". 
This result is stored in REG_H and then right-shifted by 2 bits in the SHI_2 register. The
first clocked word, right-shifted by 2 bits, is then fed back to ADD_I and added to the second
clocked result "0101101010000010". The second clocked result is obtained in the same way
as the first. The sum of the first right-shifted word and the second clocked result
"0101101010000010" is then shifted right by 2 bits, yielding "0001011010100000". This
value is then added to the third clocked result "0111000100100010", yielding
"1000011111000010". This process continues serially until the 8th clock cycle is reached.
The addressed ROM output value for the MSB of the 8th clock cycle, "0000000000000000",
is subtracted from the 1-bit right-shifted
Table XII: 2-D DCT coefficients of pattern (I) using manual calculation

0000000111111111  0000000001011111  0  0000000011000100  0000000011111111  1111111101111100  0  1111111111101100
0                 0                 0  0                 0                 0                 0  0
0                 0                 0  0                 0                 0                 0  0
0                 0                 0  0                 0                 0                 0  0
0                 0                 0  0                 0                 0                 0  0
0                 0                 0  0                 0                 0                 0  0
0                 0                 0  0                 0                 0                 0  0
0                 0                 0  0                 0                 0                 0  0

addressed ROM value of the LSB of the 8th clock cycle, "0000000000000000". This result
is then added to the value accumulated over the previous 7 clock cycles, yielding
"0000011111111111". This final result is then right-shifted by 2 bits, yielding
"0000000111111111", and output as the first 2-D DCT coefficient of the first
column. The 2-D DCT coefficients of the 8 x 8 image block of pattern I obtained by this
manual structural calculation are shown in Table XII. The detailed calculation procedure
is listed in Appendix D. Note that adding the third clocked result to the value accumulated
over the first two clock cycles generates an overflow. This overflow eventually produces a
negative value when right-shifted by 2 bits. This is an inherent drawback of 16-bit
integer arithmetic.
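The overflow noted above can be replayed with the binary words quoted in the text. The following Python sketch models the accumulator as a 16-bit wrap-around adder followed by a 2-bit right shift. Treating the shift as an arithmetic (sign-preserving) shift is an assumption about SHI_2, and the helper names are illustrative.

```python
MASK = 0xFFFF

def to_signed(word):
    """Interpret a 16-bit word as a two's-complement integer."""
    return word - 0x10000 if word & 0x8000 else word

def asr2(word):
    """Arithmetic right shift by 2 bits on a 16-bit word (assumed behavior)."""
    return (to_signed(word) >> 2) & MASK

second = 0b0101101010000010   # second clocked result quoted in the text
third = 0b0111000100100010    # third clocked result quoted in the text

acc = asr2(second)            # the first accumulated word is zero
total = (acc + third) & MASK  # 16-bit addition wraps into the sign bit

print(format(acc, "016b"))
print(format(total, "016b"), to_signed(total))
print(format(asr2(total), "016b"))
```

Two positive words add up to a word whose sign bit is set, so the value carried into the next shift is negative, which is the effect described in the text.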




C.   SIMULATION  AND  TEST  OF  IMAGE  PATTERN  (II) 

Image  pattern  II  is  equal  to  image  pattern  I  rotated  by  45°.  The  following  formula 
was  used  to  calculate  each  pixel  value. 


\[
f(x,y) = \frac{\cos\!\left(2\pi\,\tfrac{1}{4}\,x + 2\pi\,\tfrac{1}{4}\,y\right) + 1}{2} \times 128
\tag{35}
\]
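As a check on the pattern definition, the pixel values of pattern II can be regenerated numerically. The sketch below assumes a spatial frequency of 1/4 cycle per pixel in both x and y, a value inferred from the period-4 repetition of the pixel values in Table XIII rather than read from the formula; the function name `pattern_ii` is illustrative.

```python
import math

def pattern_ii(x, y, fx=0.25, fy=0.25):
    """Raised-cosine test pattern scaled to the range 0..128."""
    value = (math.cos(2 * math.pi * (fx * x + fy * y)) + 1) / 2 * 128
    return round(value)

# Print rows from y = 7 down to y = 0, matching the table layout.
for y in range(7, -1, -1):
    print(y, [pattern_ii(x, y) for x in range(8)])
```

With fy set to 0 the same function reproduces the vertical-stripe pattern I of Table II.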


Table XIII: 8 x 8 image block pixel values of pattern (II)

y = 7      64     128      64       0      64     128      64       0
y = 6       0      64     128      64       0      64     128      64
y = 5      64       0      64     128      64       0      64     128
y = 4     128      64       0      64     128      64       0      64
y = 3      64     128      64       0      64     128      64       0
y = 2       0      64     128      64       0      64     128      64
y = 1      64       0      64     128      64       0      64     128
y = 0     128      64       0      64     128      64       0      64
        x = 0   x = 1   x = 2   x = 3   x = 4   x = 5   x = 6   x = 7

The 8 x 8 image block pixel values of pattern II, in decimal, are shown in Table XIII.
The 2-D DCT of pattern II has been calculated in two ways: by VHDL simulation and by
the Spider subroutine. For the VHDL simulation, Table XIII is first converted into binary
numbers and input column by column into the VHDL 1-D DCT test bench. The resulting
1-D DCT coefficients are shown in Table XIV. For the 2-D DCT, the values
Table XIV: 1-D DCT coefficients of pattern (II) using VHDL simulation

7    016A   016A   016A   016A   016A   016A   016A   016A
6    FFBB   0043   0043   FFBB   FFBB   0043   0043   FFBB
5    0000   0000   0000   0000   0000   0000   0000   0000
4    FF74   008A   008A   FF74   FF74   008A   008A   FF74
3    00B5   00B5   FF4A   FF4A   00B5   00B5   FF4A   FF4A
2    005D   FFA2   FFA2   005D   005D   FFA2   FFA2   005D
1    0000   0000   0000   0000   0000   0000   0000   0000
0    000D   FFF2   FFF2   000D   000D   FFF2   FFF2   000D
        0      1      2      3      4      5      6      7

in  Table  XIV  are  then  transposed  manually,  and  the  results  are  input  into  the  16-bit 
VHDL  1-D  DCT  test  bench.  The  2-D  DCT  spectral  coefficients  for  pattern  II  in  VHDL 
simulation  are  listed  in  Table  XV. 

Table XV: 2-D DCT coefficients of pattern (II) using VHDL simulation

7    005F   0000   0000   0000   0000   0000   0000   0000
6    FFFF   0000   0000   0000   FFE7   0000   0000   0000
5    0000   0000   0000   0000   0000   0000   0000   0000
4    FFFF   0000   0000   0000   FFCE   0000   0000   0000
3    F555   0017   0000   0031   0000   FFDE   0000   FFFC
2    FFFF   0000   0000   0000   0020   0000   0000   0000
1    0000   0000   0000   0000   0000   0000   0000   0000
0    FFFF   0000   0000   0000   0005   0000   0000   0000
        0      1      2      3      4      5      6      7
Table XVI: Pattern II 1-D DCT coefficients using Spider Subroutine

7     181.0    181.0    181.0    181.0    181.0    181.0    181.0    181.0
6    -33.97    33.97    33.97    33.97    33.97    33.97    33.97    33.97
5         0        0        0        0        0        0        0        0
4    -69.53    69.53    69.53    69.53    69.53    69.53    69.53    69.53
3     90.51    90.51   -90.51    90.51    90.51    90.51   -90.51    90.51
2     46.46   -46.46   -46.46    46.46    46.46    46.46   -46.46    46.46
1         0        0        0        0        0        0        0        0
0     6.757   -6.757   -6.757    6.757    6.757    6.757   -6.757    6.757
          0        1        2        3        4        5        6        7

The 1-D DCT subroutine in Spider is used to double-check the VHDL simulation
result. The values in Table XIII are calculated column by column, and the result is listed
in Table XVI. This result is compared with that of Table XIV for verification.

A floating-point 2-D DCT calculation is also used to check the VHDL simulation.
For the sake of comparison, the values in Table XIV are taken and converted
into integers. After the hex-to-integer conversion, these values are transposed and
calculated column by column by the Spider 1-D DCT subroutine. The results are shown in
Table XVII.
Table XVII: 2-D DCT coefficients of pattern (II) using floating point calculation

7    1023.9        0        0        0        0        0        0        0
6    -2.828        0        0        0   -192.3        0        0        0
5         0        0        0        0        0        0        0        0
4    -2.828        0        0        0   -393.2        0        0        0
3    -1.414    192.7        0    394.4        0   -263.5        0   -38.33
2    -1.414        0        0        0    264.5        0        0        0
1         0        0        0        0        0        0        0        0
0    -1.414        0        0        0    38.18        0        0        0
          0        1        2        3        4        5        6        7

D.      RESULT  ANALYSIS 

Four methods are used to check the accuracy of the structural 1-D DCT in the VHDL
simulation. Comparing Tables VI, VIII, IX, and XII, the similarities among them are
obvious. Tables VIII and IX are almost the same, while Tables VI and XII need to be
converted into decimal numbers for ease of comparison. Table VI is first converted into
16-bit binary values; the binary words are then converted into decimal numbers using the
definition of the 16-bit binary number system (1 sign bit, 1 integer bit, and 14 fraction
bits).
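The hex-to-decimal conversion can be sketched as follows. The sketch reads each word as two's complement with 14 fraction bits and then rescales by 2^17, which is how the scale factor quoted in this section is interpreted here (an assumption); the net effect is a multiplication of the raw integer by 8. The function name is illustrative.

```python
def q14_word_to_decimal(hex_word, shift=17):
    """Convert a 16-bit hex word (1 sign, 1 integer, 14 fraction bits)
    to its rescaled decimal equivalent."""
    raw = int(hex_word, 16)
    if raw & 0x8000:              # two's-complement sign handling
        raw -= 0x10000
    fraction = raw / 2 ** 14      # value in the fixed-point format
    return fraction * 2 ** shift  # assumed 2^17 rescaling

first_row = ["01F7", "005F", "0000", "00C4", "00FF", "FF7C", "0000", "FFED"]
print([int(q14_word_to_decimal(h)) for h in first_row])
```

Applied to the first row of Table VI, this gives the decimal values 4024, 760, 0, 1568, 2040, -1056, 0 and -152.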

The multiplication factor, which accounts for the number of times the number has been
right-shifted, is 2^17. The equivalent integer values of Table VI and Table XII are shown
in Tables XVIII and XIX. Most of the values are similar to those in Tables VIII and IX, with
a few differences. Two reasons can explain this. First, the 16-bit binary number
representation is limited. Fractional numbers that are
Table XVIII: Equivalent decimal numbers of Table VI

4024    760      0   1568   2040  -1056      0   -152
   0      0      0      0      0      0      0      0
   0      0      0      0      0      0      0      0
   0      0      0      0      0      0      0      0
   0      0      0      0      0      0      0      0
   0      0      0      0      0      0      0      0
   0      0      0      0      0      0      0      0
   0      0      0      0      0      0      0      0

Table XIX: Equivalent decimal numbers of Table XII

4088    760      0   1568   2040  -1056      0   -160
   0      0      0      0      0      0      0      0
   0      0      0      0      0      0      0      0
   0      0      0      0      0      0      0      0
   0      0      0      0      0      0      0      0
   0      0      0      0      0      0      0      0
   0      0      0      0      0      0      0      0
   0      0      0      0      0      0      0      0

smaller than 2^-14 are truncated. This causes small differences between Tables VI, XII
and Tables XVIII, XIX. The second reason is overflow. The accumulated sum of the
coefficients might be greater than the largest number that a 16-bit binary number system
can represent. This overflow causes larger differences between Tables VI, XII and
Tables XVIII, XIX.
A way was found to indicate the overflow situation. A check can be made in
ADD_G and ADD_I by adding the following VHDL source code right after the
integer-to-binary number conversion.

if ((x1(15) = '1' and x2(15) = '1' and n1(15) = '0') or
    (x1(15) = '0' and x2(15) = '0' and n1(15) = '1')) then
    over1 <= '1';
end if;
if ((x3(15) = '1' and x4(15) = '1' and n2(15) = '0') or
    (x3(15) = '0' and x4(15) = '0' and n2(15) = '1')) then
    over2 <= '1';
end if;
-- ... (same check for the intermediate adders)
if ((x15(15) = '1' and x16(15) = '1' and n8(15) = '0') or
    (x15(15) = '0' and x16(15) = '0' and n8(15) = '1')) then
    over8 <= '1';
end if;
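The rule used in this check is the usual sign-bit test for two's-complement addition: overflow has occurred when both operands have the same sign bit and the sum has the opposite one. A minimal Python sketch of the same rule follows; the function name is illustrative and the sketch is not part of the VHDL model.

```python
def add16_with_overflow(a, b):
    """Add two 16-bit words and flag two's-complement overflow."""
    total = (a + b) & 0xFFFF
    sa = (a >> 15) & 1        # sign bit of the first operand
    sb = (b >> 15) & 1        # sign bit of the second operand
    st = (total >> 15) & 1    # sign bit of the sum
    overflow = (sa == 1 and sb == 1 and st == 0) or \
               (sa == 0 and sb == 0 and st == 1)
    return total, overflow

print(add16_with_overflow(0x16A0, 0x7122))   # sum wraps into the sign bit
print(add16_with_overflow(0x0B50, 0x05A8))   # sum fits in 15 magnitude bits
```

The first call uses the accumulated word and third clocked result from the pattern (I) trace, where the overflow was observed.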

Of  course,  at  the  port  declaration,  a  special  signal  declaration  must  be  made  in  order  to 
notify  the  test  bench  about  this  overflow  condition.  VHDL  source  code  for  the  port 
declaration  is  shown  below. 

port(...; b1,b2,b3,b4,b5,b6,b7,b8 : out bit_vector(15 downto 0);
     over1,over2,over3,over4,over5,over6,over7,over8 : out bit;
     CLK : bit);
In addition to this port modification, the test bench component's port also needs to be
modified. The last step in signaling the overflow condition is to declare signals and
enable the "port map" to receive the overflow signals coming from ADD_G and ADD_I.
The VHDL source code is shown below.


signal ov1,ov2,ov3,ov4,ov5,ov6,ov7,ov8 : bit;

g : add_g port map(f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16,
                   g1,g2,g3,g4,g5,g6,g7,g8,ck,qo,ov1,ov2,ov3,ov4,ov5,ov6,ov7,ov8);

i : addi port map(h1,r1,h2,r2,h3,r3,h4,r4,h5,r5,h6,r6,h7,r7,h8,r8,
                  i1,i2,i3,i4,i5,i6,i7,i8,ck,ov1,ov2,ov3,ov4,ov5,ov6,ov7,ov8);


Whenever an overflow bit "ov#" changes to '1', it indicates that the particular pixel value
has experienced overflow.

E.      EXPERIENCE 

My  experience  in  the  work  can  be  listed  as  follows. 

1.       Input  Data  Sequential  Order  error 

The sequential order of the pixels input to the parallel shift register was assumed
to be 7, 6, ..., 0. According to the transposed sequence, the actual input data arrive
in the order 0, 1, 2, ..., 7. Therefore, there would be an error if the sequence of the
transposed data were not reversed. This means that a reversing circuit would have to be
added between the transpose circuit and the input "load" circuit. However, adding an extra
circuit is rather complicated. The easiest way to solve this problem is to input the data in
the order 0, 1, ..., 7 and switch the subtrahend connections (0-7, 1-6, 2-5, 3-4) in the
2-bit adder/subtractor circuit. In this way, the input data and output data are always in the
order 0, 1, 2, ..., 7, and no extra circuit is needed.
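Why swapping the subtrahend connections is sufficient can be seen from the sum/difference stage alone: reversing the input order leaves the four sums unchanged and only flips the sign of the four differences. The Python sketch below illustrates this with arbitrary sample data; the function name and the data are illustrative and are not taken from the VHDL model.

```python
def butterflies(x, swap_subtrahend=False):
    """First-stage sums and differences of an 8-point input."""
    sums = [x[i] + x[7 - i] for i in range(4)]
    if swap_subtrahend:
        diffs = [x[7 - i] - x[i] for i in range(4)]   # swapped connections
    else:
        diffs = [x[i] - x[7 - i] for i in range(4)]   # original connections
    return sums, diffs

x = [10, 20, 30, 40, 50, 60, 70, 80]            # arbitrary sample data
reference = butterflies(x)
reversed_in = butterflies(x[::-1], swap_subtrahend=True)

print(reference)
print(reversed_in)
```

Both calls print the same sums and differences, so a reversed input order combined with swapped subtrahend connections feeds the ROM address buses exactly as before.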

2.  Formation  of  2-bit  Adder  in  VHDL  source  code 

The interface of a 2-bit adder has five inputs (two for the augend, two for the
addend, and one for the carry in) and three outputs (two for the sum and one for
the carry out). Thus, a truth table covering all possible input combinations can be made.
With five inputs, there are 2^5 = 32 combinations. After building the
8 x 32 truth table, Karnaugh map reduction can be used to minimize the Boolean
expressions. It is these Boolean expressions that are used in the
VHDL source code. A detailed example is listed in Appendix E.
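The reduced expressions can be checked exhaustively against ordinary addition. The Python sketch below transcribes the sum-of-products terms of the "adsu" listing in Appendix A (with c(0) as carry in, c(1) and c(2) as the low and high bits of one operand, and c(3) and c(4) as the low and high bits of the other) and compares them with the arithmetic sum for all 32 input combinations. The Python names are illustrative.

```python
from itertools import product

def adder_2bit(cin, a0, a1, b0, b1):
    """2-bit adder in the sum-of-products form of the adsu listing."""
    def n(v):
        return 1 - v  # logical NOT on a 0/1 value

    s0 = ((a0 & n(b0) & n(cin)) | (n(a0) & b0 & n(cin))
          | (n(a0) & n(b0) & cin) | (a0 & b0 & cin))
    s1 = ((n(a1) & n(a0) & b1 & n(cin)) | (n(a1) & b1 & n(b0) & n(cin))
          | (a1 & n(b1) & n(b0) & n(cin)) | (a1 & n(a0) & n(b1) & n(cin))
          | (n(a1) & a0 & n(b1) & b0) | (a1 & a0 & b0 & b1)
          | (n(a0) & n(a1) & b1 & n(b0)) | (n(a0) & a1 & n(b0) & n(b1))
          | (a0 & n(a1) & n(b1) & cin) | (n(a1) & n(b1) & b0 & cin)
          | (a1 & b0 & b1 & cin) | (a1 & a0 & b1 & cin))
    cout = ((a0 & a1 & b0) | (a0 & b0 & b1) | (a0 & a1 & cin)
            | (a1 & b0 & cin) | (b0 & b1 & cin) | (a1 & b1)
            | (a0 & b1 & cin))
    return s0, s1, cout

mismatches = []
for cin, a0, a1, b0, b1 in product((0, 1), repeat=5):
    s0, s1, cout = adder_2bit(cin, a0, a1, b0, b1)
    expected = (a0 + 2 * a1) + (b0 + 2 * b1) + cin
    if s0 + 2 * s1 + 4 * cout != expected:
        mismatches.append((cin, a0, a1, b0, b1))

print(len(mismatches), "mismatches out of 32 input combinations")
```

An exhaustive comparison of this kind is a convenient safeguard when a Karnaugh map reduction is done by hand.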

3.  No  Timing  control  in  Addi  Model 

Almost every circuit needs a clock to trigger and control the sequential
process. ADD_I is a special adder circuit without a triggering clock. As mentioned
earlier, the accumulator of the serial bit result consists of ADD_I and SHI_2. ADD_I
adds the incoming clocked result to the latest accumulated result right after it has been
right-shifted by 2 bits. If both circuits were triggered by the clock, there would
be a delay of one clock cycle between ADD_I and SHI_2. In other words, ADD_I
would add the incoming clocked result to the right-shifted accumulated result from
one clock cycle earlier, rather than the latest one. This would cause an error in the output
coefficients. The way to remove this one-clock-cycle delay between ADD_I
and SHI_2 is to allow only one of them to be triggered by the clock. An alternative
considered was to clock ADD_I instead of SHI_2. However,
experiment showed that this cannot be done: SHI_2 has to be cleared on every
8th clock cycle, and this clearing needs a counter to determine the exact time. In addition,
SHI_2 has to output the correct accumulated result every 8th clock period. Both of these
functions need a clock to control the timing. This is why ADD_I was chosen not to
be triggered by the clock.

4.       "Set"  control  in  Test  Bench 

Oddly, the "set" control in the test bench does not take the
value '1' at the very beginning of the simulation. The function of "set" is to initialize all
the subtractor carries in "adsu" to '1' in order to accomplish the subtraction. This
initialization is performed only once; afterwards the carry of each subtractor propagates
by itself. That is, the carry is a variable in "adsu". This carry variable is first initialized
by "set" and would be influenced by "set" again at later times if the "set" signal
were not handled specially. Fortunately, "set" has to change only once, from '0' to
'1', at the beginning of the simulation. Therefore, an "event" test is applied to "set", so
that the carry variable is loaded only when "set" changes. Since "set" changes only once,
it has no further influence on the carry variable. In addition, the time at which "set"
changes state is very important. The clock is '0' at the beginning of the simulation and
changes to '1' after 5 ns. If "set" changes state at any time other than 5 ns, the subtraction
result will be wrong. Only when "set" changes state at 5 ns is the subtraction result
correct.

5.  Signals  cannot  be  used  as  variables  in  VHDL 

In solving the problem mentioned in the previous section, an attempt was made
to use the "set" signal directly as a variable within the process. This yields
a syntax error during compilation of the source code.

6.  Preventing Negative Zero occurrences in Pack1

A paragraph of source code was added to pack1 at the end of "in_to_bi"
when negative zeros were found during the simulation. When these negative zeros arrive at
the input of shi_2, they generate very large negative numbers and cause an error at the
output. This situation was handled by adding source code that checks
for negative zeros at the end of the integer-to-binary conversion procedure. Although this
extra check works, it implies that an extra circuit must be added, which
is not desirable in circuit design. A close inspection of the in_to_bi source code
revealed a small mistake. Before the bit stream is inverted
into 2's complement code, the sign of the integer is checked in order to
assign the correct sign bit "w(15)" of the converted binary number. It was found that
"w(15) := '0'" is assigned only when "m > '0'"; all other values are
assigned "w(15) := '1'". This is how negative zeros are generated. Had the
condition "m > '0'" been changed to "m >= '0'", the extra negative-zero checking
code would not have been necessary.
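The effect of the sign test can be illustrated with a hypothetical Python model of the conversion. The in_to_bi procedure itself is not reproduced in this section, so the function below is only an assumed stand-in: it sets the sign bit from a comparison with zero and fills the lower 15 bits with the two's-complement pattern. The flag `zero_is_positive` switches between the two comparisons discussed above.

```python
def int_to_bits(m, zero_is_positive=True, width=16):
    """Hypothetical stand-in for an integer-to-binary conversion."""
    if zero_is_positive:
        sign = 0 if m >= 0 else 1     # corrected test (m >= 0)
    else:
        sign = 0 if m > 0 else 1      # faulty test (m > 0)
    low = m & ((1 << (width - 1)) - 1)    # lower 15 bits
    return format((sign << (width - 1)) | low, "016b")

print(int_to_bits(0, zero_is_positive=False))   # "negative zero"
print(int_to_bits(0))                            # all zeros
print(int_to_bits(-1448))                        # ordinary negative value
```

With the faulty test, zero is emitted with only its sign bit set, which a two's-complement stage reads as the most negative 16-bit value. This is consistent with the very large negative numbers reported above.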
VI.    CONCLUSION

The main objectives of this thesis, using VHDL to describe the 1-D DCT
structural architecture for an 8 x 8 image block and simulating it on a workstation, have
been reached. The basic theory of the 1-D DCT, the principle of distributed arithmetic, and
the actual hardware architecture have been made clearer by the VHDL simulation.
Above all, experience was gained in using VHDL to describe an algorithm and in simulating
the VHDL description. Although becoming familiar with the language and its simulation
was time-consuming, the benefits of signal tracing and timing modeling have
been demonstrated in this thesis. VHDL itself is a portable, hierarchical description
language. Therefore, the models in this thesis can be adopted in other, more complicated
designs.

Although the integer calculation in the VHDL simulation is not
as precise as floating-point calculation, the resulting energy spectrum of the 1-D DCT is
good enough to recover the original image block. Besides, absolute
accuracy is not important for image compression; it is the relative value between pixel
points that matters. Another point worth mentioning is that the approach in this thesis
has the advantage of calculation speed, since the hardware for floating-point calculation
is much more complicated than that for integer calculation.

One very important module was not described: the transpose
module. The transpose module can be connected to the test bench to complete an
automatic 2-D DCT simulation.

The simulation done here is only the initial part of the "top-down design" process.
The algorithm of an 8 x 8 image block 2-D DCT was implemented as a VHDL behavior
description. This behavior description can be further developed into gate-level
descriptions. Once the gate level is reached, the hardware circuit implementation can be
realized.
APPENDIX  A.  12-BIT  1-D  DCT  VHDL  SOURCE  CODES

-- Normal clock generator

entity clockge is
    port(CLCK : inout bit);
end clockge;

architecture clk_ctl of clockge is
begin
    process(CLCK)
        variable I : integer := 0;
    begin
        CLCK <= not CLCK after 5 ns;
        I := I + 1;
        assert I <= 80
            report "job done"
            severity Error;
    end process;
end clk_ctl;

-- Serial load parallel shift register

entity LOAD is
    port (AI : in bit_vector(11 downto 0);
          B0,B1,B2,B3,B4,B5,B6,B7 : out bit_vector(11 downto 0);
          CLK : in bit);
end LOAD;

architecture BEH of LOAD is
    type shift is array (0 to 7) of bit_vector(11 downto 0);
begin
    process
        variable A : shift;
        variable I, count : integer := 0;
    begin
        wait until CLK'event and CLK = '1';
        for count in 0 to 7 loop
            wait until CLK'event and CLK = '1';
            for I in 0 to 6 loop
                A(I) := A(I+1);
            end loop;
            A(7) := AI;

            if (count = 7) and (CLK'event and CLK = '1') then
                B0 <= A(7);
                B1 <= A(6);
                B2 <= A(5);
                B3 <= A(4);
                B4 <= A(3);
                B5 <= A(2);
                B6 <= A(1);
                B7 <= A(0);
            end if;
        end loop;
        wait on AI, CLK;
    end process;
end BEH;

-- Twice faster clock generator

entity clock is
    port(CLK : inout bit := '1');
end clock;

architecture beh of clock is
begin
    process(CLK)
        variable I : integer := 0;
    begin
        CLK <= not CLK after 2.5 ns;
        I := I + 1;
        assert I <= 160
            report "job done"
            severity Error;
    end process;
end beh;

-- Delay gate

entity delay10 is
    port(a : bit; b : out bit; CLK : bit);
end delay10;

architecture beh of delay10 is
begin
    process
        variable x : bit;
    begin
        wait until CLK'event and CLK = '1';
        x := a;
        b <= x;
        wait on CLK, a;
    end process;
end beh;
-- Parallel shift out 2-bit register

entity shift is
    port(bi0,bi1,bi2,bi3,bi4,bi5,bi6,bi7 : in bit_vector(11 downto 0);
         bo0,bo1,bo2,bo3,bo4,bo5,bo6,bo7 : out bit_vector(1 downto 0);
         CLK : in bit);
end shift;

architecture beh of shift is
begin
    process
        variable I : integer := 0;
    begin
        wait for 90 ns;
        for r in 0 to 5 loop
            wait until CLK'event and CLK = '1';
            bo0(0) <= bi0(I);
            bo0(1) <= bi0(I+1);
            bo1(0) <= bi1(I);
            bo1(1) <= bi1(I+1);
            bo2(0) <= bi2(I);
            bo2(1) <= bi2(I+1);
            bo3(0) <= bi3(I);
            bo3(1) <= bi3(I+1);
            bo4(0) <= bi4(I);
            bo4(1) <= bi4(I+1);
            bo5(0) <= bi5(I);
            bo5(1) <= bi5(I+1);
            bo6(0) <= bi6(I);
            bo6(1) <= bi6(I+1);
            bo7(0) <= bi7(I);
            bo7(1) <= bi7(I+1);
            I := I + 2;
        end loop;
        I := 0;
        wait on CLK,bi0,bi1,bi2,bi3,bi4,bi5,bi6,bi7;
    end process;
end beh;

entity adsu is
    port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(1 downto 0);
         b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(1 downto 0);
         CLK,cr,st : bit);
end adsu;
architecture beh of adsu is
begin
    process
        variable c1,c2,c3,c4,c5,c6,c7,c8 : bit_vector(4 downto 0);
        variable d1,d2,d3,d4,d5,d6,d7,d8 : bit_vector(2 downto 0);
        variable e1,e2,e3,e4,e5,e6,e7,e8 : bit;
    begin
        wait until CLK'event and CLK = '1';
        if cr'event then
            e1 := cr; e2 := cr; e3 := cr; e4 := cr;
        end if;
        if st'event then
            e5 := st; e6 := st; e7 := st; e8 := st;
        end if;

        c1(0) := e1;
        c1(1) := a0(0);
        c1(2) := a0(1);
        c1(3) := a7(0);
        c1(4) := a7(1);

        d1(0) := (c1(1) and (not c1(3)) and (not c1(0)))
              or (not c1(1) and c1(3) and (not c1(0)))
              or (not c1(1) and (not c1(3)) and c1(0))
              or (c1(1) and c1(3) and c1(0));

        d1(1) := (not c1(2) and not c1(1) and c1(4) and not c1(0))
              or (not c1(2) and c1(4) and not c1(3) and not c1(0))
              or (c1(2) and not c1(4) and not c1(3) and not c1(0))
              or (c1(2) and not c1(1) and not c1(4) and not c1(0))
              or (not c1(2) and c1(1) and not c1(4) and c1(3))
              or (c1(2) and c1(1) and c1(3) and c1(4))
              or (not c1(1) and not c1(2) and c1(4) and not c1(3))
              or (not c1(1) and c1(2) and not c1(3) and not c1(4))
              or (c1(1) and not c1(2) and not c1(4) and c1(0))
              or (not c1(2) and not c1(4) and c1(3) and c1(0))
              or (c1(2) and c1(3) and c1(4) and c1(0))
              or (c1(2) and c1(1) and c1(4) and c1(0));

        d1(2) := (c1(1) and c1(2) and c1(3))
              or (c1(1) and c1(3) and c1(4))
              or (c1(1) and c1(2) and c1(0))
              or (c1(2) and c1(3) and c1(0))
              or (c1(3) and c1(4) and c1(0))
              or (c1(2) and c1(4))
              or (c1(1) and c1(4) and c1(0));
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bO(0) 

<=  dl(0); 

bO(l)  <=  dl(l); 

el  :=  dl(2); 

  -- Slice 2: operands a1 and a6, carry e2, result b1
  c2(0) := e2;
  c2(1) := a1(0);
  c2(2) := a1(1);
  c2(3) := a6(0);
  c2(4) := a6(1);

  d2(0) := (c2(1) and (not c2(3)) and (not c2(0)))
        or (not c2(1) and c2(3) and (not c2(0)))
        or (not c2(1) and (not c2(3)) and c2(0))
        or (c2(1) and c2(3) and c2(0));

  d2(1) := (not c2(2) and not c2(1) and c2(4) and not c2(0))
        or (not c2(2) and c2(4) and not c2(3) and not c2(0))
        or (c2(2) and not c2(4) and not c2(3) and not c2(0))
        or (c2(2) and not c2(1) and not c2(4) and not c2(0))
        or (not c2(2) and c2(1) and not c2(4) and c2(3))
        or (c2(2) and c2(1) and c2(3) and c2(4))
        or (not c2(1) and not c2(2) and c2(4) and not c2(3))
        or (not c2(1) and c2(2) and not c2(3) and not c2(4))
        or (c2(1) and not c2(2) and not c2(4) and c2(0))
        or (not c2(2) and not c2(4) and c2(3) and c2(0))
        or (c2(2) and c2(3) and c2(4) and c2(0))
        or (c2(2) and c2(1) and c2(4) and c2(0));

  d2(2) := (c2(1) and c2(2) and c2(3))
        or (c2(1) and c2(3) and c2(4))
        or (c2(1) and c2(2) and c2(0))
        or (c2(2) and c2(3) and c2(0))
        or (c2(3) and c2(4) and c2(0))
        or (c2(2) and c2(4))
        or (c2(1) and c2(4) and c2(0));

  b1(0) <= d2(0);
  b1(1) <= d2(1);
  e2 := d2(2);


  -- Slice 3: operands a2 and a5, carry e3, result b2
  c3(0) := e3;
  c3(1) := a2(0);
  c3(2) := a2(1);
  c3(3) := a5(0);
  c3(4) := a5(1);

  d3(0) := (c3(1) and (not c3(3)) and (not c3(0)))
        or (not c3(1) and c3(3) and (not c3(0)))
        or (not c3(1) and (not c3(3)) and c3(0))
        or (c3(1) and c3(3) and c3(0));

  d3(1) := (not c3(2) and not c3(1) and c3(4) and not c3(0))
        or (not c3(2) and c3(4) and not c3(3) and not c3(0))
        or (c3(2) and not c3(4) and not c3(3) and not c3(0))
        or (c3(2) and not c3(1) and not c3(4) and not c3(0))
        or (not c3(2) and c3(1) and not c3(4) and c3(3))
        or (c3(2) and c3(1) and c3(3) and c3(4))
        or (not c3(1) and not c3(2) and c3(4) and not c3(3))
        or (not c3(1) and c3(2) and not c3(3) and not c3(4))
        or (c3(1) and not c3(2) and not c3(4) and c3(0))
        or (not c3(2) and not c3(4) and c3(3) and c3(0))
        or (c3(2) and c3(3) and c3(4) and c3(0))
        or (c3(2) and c3(1) and c3(4) and c3(0));

  d3(2) := (c3(1) and c3(2) and c3(3))
        or (c3(1) and c3(3) and c3(4))
        or (c3(1) and c3(2) and c3(0))
        or (c3(2) and c3(3) and c3(0))
        or (c3(3) and c3(4) and c3(0))
        or (c3(2) and c3(4))
        or (c3(1) and c3(4) and c3(0));

  b2(0) <= d3(0);
  b2(1) <= d3(1);
  e3 := d3(2);


  -- Slice 4: operands a3 and a4, carry e4, result b3
  c4(0) := e4;
  c4(1) := a3(0);
  c4(2) := a3(1);
  c4(3) := a4(0);
  c4(4) := a4(1);

  d4(0) := (c4(1) and (not c4(3)) and (not c4(0)))
        or (not c4(1) and c4(3) and (not c4(0)))
        or (not c4(1) and (not c4(3)) and c4(0))
        or (c4(1) and c4(3) and c4(0));

  d4(1) := (not c4(2) and not c4(1) and c4(4) and not c4(0))
        or (not c4(2) and c4(4) and not c4(3) and not c4(0))
        or (c4(2) and not c4(4) and not c4(3) and not c4(0))
        or (c4(2) and not c4(1) and not c4(4) and not c4(0))
        or (not c4(2) and c4(1) and not c4(4) and c4(3))
        or (c4(2) and c4(1) and c4(3) and c4(4))
        or (not c4(1) and not c4(2) and c4(4) and not c4(3))
        or (not c4(1) and c4(2) and not c4(3) and not c4(4))
        or (c4(1) and not c4(2) and not c4(4) and c4(0))
        or (not c4(2) and not c4(4) and c4(3) and c4(0))
        or (c4(2) and c4(3) and c4(4) and c4(0))
        or (c4(2) and c4(1) and c4(4) and c4(0));

  d4(2) := (c4(1) and c4(2) and c4(3))
        or (c4(1) and c4(3) and c4(4))
        or (c4(1) and c4(2) and c4(0))
        or (c4(2) and c4(3) and c4(0))
        or (c4(3) and c4(4) and c4(0))
        or (c4(2) and c4(4))
        or (c4(1) and c4(4) and c4(0));

  b3(0) <= d4(0);
  b3(1) <= d4(1);
  e4 := d4(2);


  -- Slice 5: operands a3 and (not a4), carry e5, result b4
  c5(0) := e5;
  c5(1) := a3(0);
  c5(2) := a3(1);
  c5(3) := not a4(0);
  c5(4) := not a4(1);

  d5(0) := (c5(1) and (not c5(3)) and (not c5(0)))
        or (not c5(1) and c5(3) and (not c5(0)))
        or (not c5(1) and (not c5(3)) and c5(0))
        or (c5(1) and c5(3) and c5(0));

  d5(1) := (not c5(2) and not c5(1) and c5(4) and not c5(0))
        or (not c5(2) and c5(4) and not c5(3) and not c5(0))
        or (c5(2) and not c5(4) and not c5(3) and not c5(0))
        or (c5(2) and not c5(1) and not c5(4) and not c5(0))
        or (not c5(2) and c5(1) and not c5(4) and c5(3))
        or (c5(2) and c5(1) and c5(3) and c5(4))
        or (not c5(1) and not c5(2) and c5(4) and not c5(3))
        or (not c5(1) and c5(2) and not c5(3) and not c5(4))
        or (c5(1) and not c5(2) and not c5(4) and c5(0))
        or (not c5(2) and not c5(4) and c5(3) and c5(0))
        or (c5(2) and c5(3) and c5(4) and c5(0))
        or (c5(2) and c5(1) and c5(4) and c5(0));

  d5(2) := (c5(1) and c5(2) and c5(3))
        or (c5(1) and c5(3) and c5(4))
        or (c5(1) and c5(2) and c5(0))
        or (c5(2) and c5(3) and c5(0))
        or (c5(3) and c5(4) and c5(0))
        or (c5(2) and c5(4))
        or (c5(1) and c5(4) and c5(0));

  b4(0) <= d5(0);
  b4(1) <= d5(1);
  e5 := d5(2);


  -- Slice 6: operands a2 and (not a5), carry e6, result b5
  c6(0) := e6;
  c6(1) := a2(0);
  c6(2) := a2(1);
  c6(3) := not a5(0);
  c6(4) := not a5(1);

  d6(0) := (c6(1) and (not c6(3)) and (not c6(0)))
        or (not c6(1) and c6(3) and (not c6(0)))
        or (not c6(1) and (not c6(3)) and c6(0))
        or (c6(1) and c6(3) and c6(0));

  d6(1) := (not c6(2) and not c6(1) and c6(4) and not c6(0))
        or (not c6(2) and c6(4) and not c6(3) and not c6(0))
        or (c6(2) and not c6(4) and not c6(3) and not c6(0))
        or (c6(2) and not c6(1) and not c6(4) and not c6(0))
        or (not c6(2) and c6(1) and not c6(4) and c6(3))
        or (c6(2) and c6(1) and c6(3) and c6(4))
        or (not c6(1) and not c6(2) and c6(4) and not c6(3))
        or (not c6(1) and c6(2) and not c6(3) and not c6(4))
        or (c6(1) and not c6(2) and not c6(4) and c6(0))
        or (not c6(2) and not c6(4) and c6(3) and c6(0))
        or (c6(2) and c6(3) and c6(4) and c6(0))
        or (c6(2) and c6(1) and c6(4) and c6(0));

  d6(2) := (c6(1) and c6(2) and c6(3))
        or (c6(1) and c6(3) and c6(4))
        or (c6(1) and c6(2) and c6(0))
        or (c6(2) and c6(3) and c6(0))
        or (c6(3) and c6(4) and c6(0))
        or (c6(2) and c6(4))
        or (c6(1) and c6(4) and c6(0));

  b5(0) <= d6(0);
  b5(1) <= d6(1);
  e6 := d6(2);


  -- Slice 7: operands a1 and (not a6), carry e7, result b6
  c7(0) := e7;
  c7(1) := a1(0);
  c7(2) := a1(1);
  c7(3) := not a6(0);
  c7(4) := not a6(1);

  d7(0) := (c7(1) and (not c7(3)) and (not c7(0)))
        or (not c7(1) and c7(3) and (not c7(0)))
        or (not c7(1) and (not c7(3)) and c7(0))
        or (c7(1) and c7(3) and c7(0));

  d7(1) := (not c7(2) and not c7(1) and c7(4) and not c7(0))
        or (not c7(2) and c7(4) and not c7(3) and not c7(0))
        or (c7(2) and not c7(4) and not c7(3) and not c7(0))
        or (c7(2) and not c7(1) and not c7(4) and not c7(0))
        or (not c7(2) and c7(1) and not c7(4) and c7(3))
        or (c7(2) and c7(1) and c7(3) and c7(4))
        or (not c7(1) and not c7(2) and c7(4) and not c7(3))
        or (not c7(1) and c7(2) and not c7(3) and not c7(4))
        or (c7(1) and not c7(2) and not c7(4) and c7(0))
        or (not c7(2) and not c7(4) and c7(3) and c7(0))
        or (c7(2) and c7(3) and c7(4) and c7(0))
        or (c7(2) and c7(1) and c7(4) and c7(0));

  d7(2) := (c7(1) and c7(2) and c7(3))
        or (c7(1) and c7(3) and c7(4))
        or (c7(1) and c7(2) and c7(0))
        or (c7(2) and c7(3) and c7(0))
        or (c7(3) and c7(4) and c7(0))
        or (c7(2) and c7(4))
        or (c7(1) and c7(4) and c7(0));

  b6(0) <= d7(0);
  b6(1) <= d7(1);
  e7 := d7(2);

  -- Slice 8: operands a0 and (not a7), carry e8, result b7
  c8(0) := e8;
  c8(1) := a0(0);
  c8(2) := a0(1);
  c8(3) := not a7(0);
  c8(4) := not a7(1);

  d8(0) := (c8(1) and (not c8(3)) and (not c8(0)))
        or (not c8(1) and c8(3) and (not c8(0)))
        or (not c8(1) and (not c8(3)) and c8(0))
        or (c8(1) and c8(3) and c8(0));

  d8(1) := (not c8(2) and not c8(1) and c8(4) and not c8(0))
        or (not c8(2) and c8(4) and not c8(3) and not c8(0))
        or (c8(2) and not c8(4) and not c8(3) and not c8(0))
        or (c8(2) and not c8(1) and not c8(4) and not c8(0))
        or (not c8(2) and c8(1) and not c8(4) and c8(3))
        or (c8(2) and c8(1) and c8(3) and c8(4))
        or (not c8(1) and not c8(2) and c8(4) and not c8(3))
        or (not c8(1) and c8(2) and not c8(3) and not c8(4))
        or (c8(1) and not c8(2) and not c8(4) and c8(0))
        or (not c8(2) and not c8(4) and c8(3) and c8(0))
        or (c8(2) and c8(3) and c8(4) and c8(0))
        or (c8(2) and c8(1) and c8(4) and c8(0));

  d8(2) := (c8(1) and c8(2) and c8(3))
        or (c8(1) and c8(3) and c8(4))
        or (c8(1) and c8(2) and c8(0))
        or (c8(2) and c8(3) and c8(0))
        or (c8(3) and c8(4) and c8(0))
        or (c8(2) and c8(4))
        or (c8(1) and c8(4) and c8(0));

  b7(0) <= d8(0);
  b7(1) <= d8(1);
  e8 := d8(2);

  wait on a0,a1,a2,a3,a4,a5,a6,a7,CLK,cr,st;
end process;
end beh;
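
The three expressions repeated in every slice of adsu can be read as a 2-bit addition with carry: d(0) and d(1) are the sum bits of the two operands plus the carry held in e, and d(2) is the carry saved for the next clock. A minimal sketch of one slice, written as two chained full adders instead of sum-of-products terms, is given below. The package adsu_sketch and the procedure add2 are hypothetical names introduced only for illustration; they are not part of the thesis code.

```vhdl
-- Hypothetical helper, not part of the thesis listing: one 2-bit slice
-- of adsu expressed as two chained full adders.
package adsu_sketch is
  procedure add2 (a, b : in  bit_vector(1 downto 0);
                  cin  : in  bit;
                  s    : out bit_vector(1 downto 0);
                  cout : out bit);
end adsu_sketch;

package body adsu_sketch is
  procedure add2 (a, b : in  bit_vector(1 downto 0);
                  cin  : in  bit;
                  s    : out bit_vector(1 downto 0);
                  cout : out bit) is
    variable c_mid : bit;  -- carry out of the low-order bit
  begin
    -- low-order full adder (corresponds to d(0))
    s(0)  := a(0) xor b(0) xor cin;
    c_mid := (a(0) and b(0)) or (a(0) and cin) or (b(0) and cin);
    -- high-order full adder (corresponds to d(1) and d(2))
    s(1)  := a(1) xor b(1) xor c_mid;
    cout  := (a(1) and b(1)) or (a(1) and c_mid) or (b(1) and c_mid);
  end add2;
end adsu_sketch;
```

Under this reading, slices 5 to 8 pass the second operand inverted, as in the listing, so that a carry preset through st would presumably make the slice subtract in two's complement.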

Register 

entity reg is
  port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(1 downto 0);
       b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(1 downto 0);
       CLK : bit);
end reg;

architecture beh of reg is
begin
process
  variable d0,d1,d2,d3,d4,d5,d6,d7 : bit_vector(1 downto 0);
begin
  -- sample the inputs
  d0 := a0;
  d1 := a1;
  d2 := a2;
  d3 := a3;
  d4 := a4;
  d5 := a5;
  d6 := a6;
  d7 := a7;
  -- drive the outputs on the rising clock edge
  wait until (CLK'event and CLK = '1');
  b0 <= d0;
  b1 <= d1;
  b2 <= d2;
  b3 <= d3;
  b4 <= d4;
  b5 <= d5;
  b6 <= d6;
  b7 <= d7;
  wait on CLK;
end process;
end beh;
ROM 

entity rom is
  port(e0,e1,e2,e3,e4,e5,e6,e7 : bit_vector(1 downto 0);
       b10,b11,b20,b21,b30,b31,b40,b41,b50,b51,b60,b61,b70,b71,b80,b81 :
         out bit_vector(15 downto 0);
       CLK : bit);
end rom;

architecture beh of rom is
begin
process
  variable a10,a11,a20,a21,a30,a31,a40,a41,a50,a51,a60,a61,a70,a71,
           a80,a81 : bit_vector(3 downto 0);
begin
  wait until CLK'event and CLK = '1';

  a10(3) := e0(0); a10(2) := e1(0); a10(1) := e2(0); a10(0) := e3(0);
  a11(3) := e0(1); a11(2) := e1(1); a11(1) := e2(1); a11(0) := e3(1);
  a20(3) := e7(0); a20(2) := e6(0); a20(1) := e5(0); a20(0) := e4(0);
  a21(3) := e7(1); a21(2) := e6(1); a21(1) := e5(1); a21(0) := e4(1);
  a30(3) := e0(0); a30(2) := e1(0); a30(1) := e2(0); a30(0) := e3(0);
  a31(3) := e0(1); a31(2) := e1(1); a31(1) := e2(1); a31(0) := e3(1);
  a40(3) := e7(0); a40(2) := e6(0); a40(1) := e5(0); a40(0) := e4(0);
  a41(3) := e7(1); a41(2) := e6(1); a41(1) := e5(1); a41(0) := e4(1);
  a50(3) := e0(0); a50(2) := e1(0); a50(1) := e2(0); a50(0) := e3(0);
  a51(3) := e0(1); a51(2) := e1(1); a51(1) := e2(1); a51(0) := e3(1);
  a60(3) := e7(0); a60(2) := e6(0); a60(1) := e5(0); a60(0) := e4(0);
  a61(3) := e7(1); a61(2) := e6(1); a61(1) := e5(1); a61(0) := e4(1);
  a70(3) := e0(0); a70(2) := e1(0); a70(1) := e2(0); a70(0) := e3(0);
  a71(3) := e0(1); a71(2) := e1(1); a71(1) := e2(1); a71(0) := e3(1);
  a80(3) := e7(0); a80(2) := e6(0); a80(1) := e5(0); a80(0) := e4(0);
  a81(3) := e7(1); a81(2) := e6(1); a81(1) := e5(1); a81(0) := e4(1);

  case a10 is
    when "0000" => b10 <= "0000000000000000";
    when "0001" => b10 <= "0001011010100000";
    when "0010" => b10 <= "0001011010100000";
    when "0011" => b10 <= "0010110101000001";
    when "0100" => b10 <= "0001011010100000";
    when "0101" => b10 <= "0010110101000001";
    when "0110" => b10 <= "0010110101000001";
    when "0111" => b10 <= "0100001111100001";
    when "1000" => b10 <= "0001011010100000";
    when "1001" => b10 <= "0010110101000001";
    when "1010" => b10 <= "0010110101000001";
    when "1011" => b10 <= "0100001111100001";
    when "1100" => b10 <= "0010110101000001";
    when "1101" => b10 <= "0100001111100001";
    when "1110" => b10 <= "0100001111100001";
    when "1111" => b10 <= "0101101010000010";
  end case;


  case a11 is
    when "0000" => b11 <= "0000000000000000";
    when "0001" => b11 <= "0001011010100000";
    when "0010" => b11 <= "0001011010100000";
    when "0011" => b11 <= "0010110101000001";
    when "0100" => b11 <= "0001011010100000";
    when "0101" => b11 <= "0010110101000001";
    when "0110" => b11 <= "0010110101000001";
    when "0111" => b11 <= "0100001111100001";
    when "1000" => b11 <= "0001011010100000";
    when "1001" => b11 <= "0010110101000001";
    when "1010" => b11 <= "0010110101000001";
    when "1011" => b11 <= "0100001111100001";
    when "1100" => b11 <= "0010110101000001";
    when "1101" => b11 <= "0100001111100001";
    when "1110" => b11 <= "0100001111100001";
    when "1111" => b11 <= "0101101010000010";
  end case;

  case a20 is
    when "0000" => b20 <= "0000000000000000";
    when "0001" => b20 <= "0000011000111110";
    when "0010" => b20 <= "0001000111000111";
    when "0011" => b20 <= "0001100000000101";
    when "0100" => b20 <= "0001101010011011";
    when "0101" => b20 <= "0010000011011001";
    when "0110" => b20 <= "0010110001100010";
    when "0111" => b20 <= "0011001010100000";
    when "1000" => b20 <= "0001111101100010";
    when "1001" => b20 <= "0010010110100000";
    when "1010" => b20 <= "0011000100101001";
    when "1011" => b20 <= "0011011101101000";
    when "1100" => b20 <= "0011100111111101";
    when "1101" => b20 <= "0100000000111100";
    when "1110" => b20 <= "0100101111000101";
    when "1111" => b20 <= "0101001000000011";
  end case;

  case a21 is
    when "0000" => b21 <= "0000000000000000";
    when "0001" => b21 <= "0000011000111110";
    when "0010" => b21 <= "0001000111000111";
    when "0011" => b21 <= "0001100000000101";
    when "0100" => b21 <= "0001101010011011";
    when "0101" => b21 <= "0010000011011001";
    when "0110" => b21 <= "0010110001100010";
    when "0111" => b21 <= "0011001010100000";
    when "1000" => b21 <= "0001111101100010";
    when "1001" => b21 <= "0010010110100000";
    when "1010" => b21 <= "0011000100101001";
    when "1011" => b21 <= "0011011101101000";
    when "1100" => b21 <= "0011100111111101";
    when "1101" => b21 <= "0100000000111100";
    when "1110" => b21 <= "0100101111000101";
    when "1111" => b21 <= "0101001000000011";
  end case;

  case a30 is
    when "0000" => b30 <= "0000000000000000";
    when "0001" => b30 <= "1110001001110000";
    when "0010" => b30 <= "1111001111000010";
    when "0011" => b30 <= "1101011000110001";
    when "0100" => b30 <= "0000110000111110";
    when "0101" => b30 <= "1110111010101111";
    when "0110" => b30 <= "0000000000000000";
    when "0111" => b30 <= "1110001001110000";
    when "1000" => b30 <= "0001110110010000";
    when "1001" => b30 <= "0000000000000000";
    when "1010" => b30 <= "0001000101010001";
    when "1011" => b30 <= "1111001111000010";
    when "1100" => b30 <= "0010100111001111";
    when "1101" => b30 <= "0000110000111110";
    when "1110" => b30 <= "0001110110010000";
    when "1111" => b30 <= "0000000000000000";
  end case;

  case a31 is
    when "0000" => b31 <= "0000000000000000";
    when "0001" => b31 <= "1110001001110000";
    when "0010" => b31 <= "1111001111000010";
    when "0011" => b31 <= "1101011000110001";
    when "0100" => b31 <= "0000110000111110";
    when "0101" => b31 <= "1110111010101111";
    when "0110" => b31 <= "0000000000000000";
    when "0111" => b31 <= "1110001001110000";
    when "1000" => b31 <= "0001110110010000";
    when "1001" => b31 <= "0000000000000000";
    when "1010" => b31 <= "0001000101010001";
    when "1011" => b31 <= "1111001111000010";
    when "1100" => b31 <= "0010100111001111";
    when "1101" => b31 <= "0000110000111110";
    when "1110" => b31 <= "0001110110010000";
    when "1111" => b31 <= "0000000000000000";
  end case;


  case a40 is
    when "0000" => b40 <= "0000000000000000";
    when "0001" => b40 <= "1110111000111001";
    when "0010" => b40 <= "1110000010011110";
    when "0011" => b40 <= "1100111011010111";
    when "0100" => b40 <= "1111100111000010";
    when "0101" => b40 <= "1110011111111011";
    when "0110" => b40 <= "1101101001100000";
    when "0111" => b40 <= "1100100010011000";
    when "1000" => b40 <= "0001101010011011";
    when "1001" => b40 <= "0000100011010100";
    when "1010" => b40 <= "1111101100111001";
    when "1011" => b40 <= "1110100101110010";
    when "1100" => b40 <= "0001010001011101";
    when "1101" => b40 <= "0000001010010101";
    when "1110" => b40 <= "1111010011111011";
    when "1111" => b40 <= "1110001100110100";
  end case;

  case a41 is
    when "0000" => b41 <= "0000000000000000";
    when "0001" => b41 <= "1110111000111001";
    when "0010" => b41 <= "1110000010011110";
    when "0011" => b41 <= "1100111011010111";
    when "0100" => b41 <= "1111100111000010";
    when "0101" => b41 <= "1110011111111011";
    when "0110" => b41 <= "1101101001100000";
    when "0111" => b41 <= "1100100010011000";
    when "1000" => b41 <= "0001101010011011";
    when "1001" => b41 <= "0000100011010100";
    when "1010" => b41 <= "1111101100111001";
    when "1011" => b41 <= "1110100101110010";
    when "1100" => b41 <= "0001010001011101";
    when "1101" => b41 <= "0000001010010101";
    when "1110" => b41 <= "1111010011111011";
    when "1111" => b41 <= "1110001100110100";
  end case;

  case a50 is
    when "0000" => b50 <= "0000000000000000";
    when "0001" => b50 <= "0001011010100000";
    when "0010" => b50 <= "1110100101100000";
    when "0011" => b50 <= "0000000000000000";
    when "0100" => b50 <= "1110100101100000";
    when "0101" => b50 <= "0000000000000000";
    when "0110" => b50 <= "1101001010111111";
    when "0111" => b50 <= "1110100101100000";
    when "1000" => b50 <= "0001011010100000";
    when "1001" => b50 <= "0010110101000001";
    when "1010" => b50 <= "0000000000000000";
    when "1011" => b50 <= "0001011010100000";
    when "1100" => b50 <= "0000000000000000";
    when "1101" => b50 <= "0001011010100000";
    when "1110" => b50 <= "1110100101100000";
    when "1111" => b50 <= "0000000000000000";
  end case;

  case a51 is
    when "0000" => b51 <= "0000000000000000";
    when "0001" => b51 <= "0001011010100000";
    when "0010" => b51 <= "1110100101100000";
    when "0011" => b51 <= "0000000000000000";
    when "0100" => b51 <= "1110100101100000";
    when "0101" => b51 <= "0000000000000000";
    when "0110" => b51 <= "1101001010111111";
    when "0111" => b51 <= "1110100101100000";
    when "1000" => b51 <= "0001011010100000";
    when "1001" => b51 <= "0010110101000001";
    when "1010" => b51 <= "0000000000000000";
    when "1011" => b51 <= "0001011010100000";
    when "1100" => b51 <= "0000000000000000";
    when "1101" => b51 <= "0001011010100000";
    when "1110" => b51 <= "1110100101100000";
    when "1111" => b51 <= "0000000000000000";
  end case;


  case a60 is
    when "0000" => b60 <= "0000000000000000";
    when "0001" => b60 <= "0001101010011011";
    when "0010" => b60 <= "0000011000111110";
    when "0011" => b60 <= "0010000011011001";
    when "0100" => b60 <= "1110000010011110";
    when "0101" => b60 <= "1111101100111001";
    when "0110" => b60 <= "1110011011011100";
    when "0111" => b60 <= "0000000101110110";
    when "1000" => b60 <= "0001000111000111";
    when "1001" => b60 <= "0010110001100010";
    when "1010" => b60 <= "0001100000000101";
    when "1011" => b60 <= "0011001010100000";
    when "1100" => b60 <= "1111001001100101";
    when "1101" => b60 <= "0000110100000000";
    when "1110" => b60 <= "1111100010100011";
    when "1111" => b60 <= "0001001100111110";
  end case;

  case a61 is
    when "0000" => b61 <= "0000000000000000";
    when "0001" => b61 <= "0001101010011011";
    when "0010" => b61 <= "0000011000111110";
    when "0011" => b61 <= "0010000011011001";
    when "0100" => b61 <= "1110000010011110";
    when "0101" => b61 <= "1111101100111001";
    when "0110" => b61 <= "1110011011011100";
    when "0111" => b61 <= "0000000101110110";
    when "1000" => b61 <= "0001000111000111";
    when "1001" => b61 <= "0010110001100010";
    when "1010" => b61 <= "0001100000000101";
    when "1011" => b61 <= "0011001010100000";
    when "1100" => b61 <= "1111001001100101";
    when "1101" => b61 <= "0000110100000000";
    when "1110" => b61 <= "1111100010100011";
    when "1111" => b61 <= "0001001100111110";
  end case;

  case a70 is
    when "0000" => b70 <= "0000000000000000";
    when "0001" => b70 <= "1111001111000010";
    when "0010" => b70 <= "0001110110010000";
    when "0011" => b70 <= "0001000101010001";
    when "0100" => b70 <= "1110001001110000";
    when "0101" => b70 <= "1101011000110001";
    when "0110" => b70 <= "0000000000000000";
    when "0111" => b70 <= "1111001111000010";
    when "1000" => b70 <= "0000110000111110";
    when "1001" => b70 <= "0000000000000000";
    when "1010" => b70 <= "0010100111001111";
    when "1011" => b70 <= "0001110110010000";
    when "1100" => b70 <= "1110111010101111";
    when "1101" => b70 <= "1110001001110000";
    when "1110" => b70 <= "0000110000111110";
    when "1111" => b70 <= "0000000000000000";
  end case;

  case a71 is
    when "0000" => b71 <= "0000000000000000";
    when "0001" => b71 <= "1111001111000010";
    when "0010" => b71 <= "0001110110010000";
    when "0011" => b71 <= "0001000101010001";
    when "0100" => b71 <= "1110001001110000";
    when "0101" => b71 <= "1101011000110001";
    when "0110" => b71 <= "0000000000000000";
    when "0111" => b71 <= "1111001111000010";
    when "1000" => b71 <= "0000110000111110";
    when "1001" => b71 <= "0000000000000000";
    when "1010" => b71 <= "0010100111001111";
    when "1011" => b71 <= "0001110110010000";
    when "1100" => b71 <= "1110111010101111";
    when "1101" => b71 <= "1110001001110000";
    when "1110" => b71 <= "0000110000111110";
    when "1111" => b71 <= "0000000000000000";
  end case;

  case a80 is
    when "0000" => b80 <= "0000000000000000";
    when "0001" => b80 <= "1110000010011110";
    when "0010" => b80 <= "0001101010011011";
    when "0011" => b80 <= "1111101100111001";
    when "0100" => b80 <= "1110111000111001";
    when "0101" => b80 <= "1100111011010111";
    when "0110" => b80 <= "0000100011010100";
    when "0111" => b80 <= "1110100101110010";
    when "1000" => b80 <= "0000011000111110";
    when "1001" => b80 <= "1110011011011100";
    when "1010" => b80 <= "0010000011011001";
    when "1011" => b80 <= "0000000101110110";
    when "1100" => b80 <= "1111010001110111";
    when "1101" => b80 <= "1101010100010101";
    when "1110" => b80 <= "0000111100010010";
    when "1111" => b80 <= "1110111110110000";
  end case;

  case a81 is
    when "0000" => b81 <= "0000000000000000";
    when "0001" => b81 <= "1110000010011110";
    when "0010" => b81 <= "0001101010011011";
    when "0011" => b81 <= "1111101100111001";
    when "0100" => b81 <= "1110111000111001";
    when "0101" => b81 <= "1100111011010111";
    when "0110" => b81 <= "0000100011010100";
    when "0111" => b81 <= "1110100101110010";
    when "1000" => b81 <= "0000011000111110";
    when "1001" => b81 <= "1110011011011100";
    when "1010" => b81 <= "0010000011011001";
    when "1011" => b81 <= "0000000101110110";
    when "1100" => b81 <= "1111010001110111";
    when "1101" => b81 <= "1101010100010101";
    when "1110" => b81 <= "0000111100010010";
    when "1111" => b81 <= "1110111110110000";
  end case;

  wait on e0,e1,e2,e3,e4,e5,e6,e7,CLK;
end process;
end beh;
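
The sixteen words of each case statement in rom look like the partial sums used in distributed arithmetic: each address bit decides whether one fixed coefficient is added. In the a20 table, for example, the single-bit addresses hold 1598, 4551, 6811 and 8034 (values scaled by 2**14), and address "0011" holds 6149 = 1598 + 4551. Assuming that reading, one table entry could be generated as sketched below, up to rounding in the last bit. The package rom_sketch, the function da_entry and the parameter names k3 to k0 are hypothetical and do not appear in the thesis code.

```vhdl
-- Hypothetical helper, not part of the thesis listing: build one
-- distributed-arithmetic table entry from four scaled coefficients.
package rom_sketch is
  function da_entry (addr           : bit_vector(3 downto 0);
                     k3, k2, k1, k0 : integer) return integer;
end rom_sketch;

package body rom_sketch is
  function da_entry (addr           : bit_vector(3 downto 0);
                     k3, k2, k1, k0 : integer) return integer is
    variable acc : integer := 0;  -- running partial sum
  begin
    -- add the coefficient of every address bit that is set
    if addr(3) = '1' then acc := acc + k3; end if;
    if addr(2) = '1' then acc := acc + k2; end if;
    if addr(1) = '1' then acc := acc + k1; end if;
    if addr(0) = '1' then acc := acc + k0; end if;
    return acc;
  end da_entry;
end rom_sketch;
```

With k3 = 8034, k2 = 6811, k1 = 4551 and k0 = 1598, address "0101" gives 6811 + 1598 = 8409, which matches the listed word "0010000011011001".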

Shift  right  1-bit  register 

entity shi_1 is
  port(f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16 :
         bit_vector(15 downto 0);
       b10,b11,b20,b21,b30,b31,b40,b41,b50,b51,b60,b61,b70,b71,b80,b81 :
         out bit_vector(15 downto 0);
       CLK : bit);
end shi_1;

architecture beh of shi_1 is
begin
process
  variable a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
begin
  wait until CLK'event and CLK = '1';

  if f1(15) = '0' then
    a1(15) := '0';
  else
    a1(15) := '1';
  end if;
  a1(14) := f1(15); a1(13) := f1(14); a1(12) := f1(13); a1(11) := f1(12);
  a1(10) := f1(11); a1(9) := f1(10); a1(8) := f1(9); a1(7) := f1(8);
  a1(6) := f1(7); a1(5) := f1(6); a1(4) := f1(5); a1(3) := f1(4);
  a1(2) := f1(3); a1(1) := f1(2); a1(0) := f1(1);
  b10 <= a1;
  b11 <= f2;

  if f3(15) = '0' then
    a2(15) := '0';
  else
    a2(15) := '1';
  end if;
  a2(14) := f3(15); a2(13) := f3(14); a2(12) := f3(13); a2(11) := f3(12);
  a2(10) := f3(11); a2(9) := f3(10); a2(8) := f3(9); a2(7) := f3(8);
  a2(6) := f3(7); a2(5) := f3(6); a2(4) := f3(5); a2(3) := f3(4);
  a2(2) := f3(3); a2(1) := f3(2); a2(0) := f3(1);
  b20 <= a2;
  b21 <= f4;

  if f5(15) = '0' then
    a3(15) := '0';
  else
    a3(15) := '1';
  end if;
  a3(14) := f5(15); a3(13) := f5(14); a3(12) := f5(13); a3(11) := f5(12);
  a3(10) := f5(11); a3(9) := f5(10); a3(8) := f5(9); a3(7) := f5(8);
  a3(6) := f5(7); a3(5) := f5(6); a3(4) := f5(5); a3(3) := f5(4);
  a3(2) := f5(3); a3(1) := f5(2); a3(0) := f5(1);
  b30 <= a3;
  b31 <= f6;

  if f7(15) = '0' then
    a4(15) := '0';
  else
    a4(15) := '1';
  end if;
  a4(14) := f7(15); a4(13) := f7(14); a4(12) := f7(13); a4(11) := f7(12);
  a4(10) := f7(11); a4(9) := f7(10); a4(8) := f7(9); a4(7) := f7(8);
  a4(6) := f7(7); a4(5) := f7(6); a4(4) := f7(5); a4(3) := f7(4);
  a4(2) := f7(3); a4(1) := f7(2); a4(0) := f7(1);
  b40 <= a4;
  b41 <= f8;

  if f9(15) = '0' then
    a5(15) := '0';
  else
    a5(15) := '1';
  end if;
  a5(14) := f9(15); a5(13) := f9(14); a5(12) := f9(13); a5(11) := f9(12);
  a5(10) := f9(11); a5(9) := f9(10); a5(8) := f9(9); a5(7) := f9(8);
  a5(6) := f9(7); a5(5) := f9(6); a5(4) := f9(5); a5(3) := f9(4);
  a5(2) := f9(3); a5(1) := f9(2); a5(0) := f9(1);
  b50 <= a5;
  b51 <= f10;

  if f11(15) = '0' then
    a6(15) := '0';
  else
    a6(15) := '1';
  end if;
  a6(14) := f11(15); a6(13) := f11(14); a6(12) := f11(13); a6(11) := f11(12);
  a6(10) := f11(11); a6(9) := f11(10); a6(8) := f11(9); a6(7) := f11(8);
  a6(6) := f11(7); a6(5) := f11(6); a6(4) := f11(5); a6(3) := f11(4);
  a6(2) := f11(3); a6(1) := f11(2); a6(0) := f11(1);
  b60 <= a6;
  b61 <= f12;

  if f13(15) = '0' then
    a7(15) := '0';
  else
    a7(15) := '1';
  end if;
  a7(14) := f13(15); a7(13) := f13(14); a7(12) := f13(13); a7(11) := f13(12);
  a7(10) := f13(11); a7(9) := f13(10); a7(8) := f13(9); a7(7) := f13(8);
  a7(6) := f13(7); a7(5) := f13(6); a7(4) := f13(5); a7(3) := f13(4);
  a7(2) := f13(3); a7(1) := f13(2); a7(0) := f13(1);
  b70 <= a7;
  b71 <= f14;

  if f15(15) = '0' then
    a8(15) := '0';
  else
    a8(15) := '1';
  end if;
  a8(14) := f15(15); a8(13) := f15(14); a8(12) := f15(13); a8(11) := f15(12);
  a8(10) := f15(11); a8(9) := f15(10); a8(8) := f15(9); a8(7) := f15(8);
  a8(6) := f15(7); a8(5) := f15(6); a8(4) := f15(5); a8(3) := f15(4);
  a8(2) := f15(3); a8(1) := f15(2); a8(0) := f15(1);
  b80 <= a8;
  b81 <= f16;

  wait on f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16,CLK;
end process;
end beh;
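
Each odd-numbered input of the shift register is moved right by one bit with the sign bit repeated, while the even-numbered inputs pass through unchanged. Assuming that reading, the per-bit assignments of one channel can be sketched more compactly with slices. The package shift_sketch and the function shift_right_1 are hypothetical names used only for illustration; they are not part of the thesis code.

```vhdl
-- Hypothetical helper, not part of the thesis listing: arithmetic
-- shift right by one bit on a 16-bit two's-complement word.
package shift_sketch is
  function shift_right_1 (f : bit_vector(15 downto 0)) return bit_vector;
end shift_sketch;

package body shift_sketch is
  function shift_right_1 (f : bit_vector(15 downto 0)) return bit_vector is
    variable a : bit_vector(15 downto 0);
  begin
    a(15)          := f(15);           -- keep the sign bit
    a(14 downto 0) := f(15 downto 1);  -- move every bit one place down
    return a;
  end shift_right_1;
end shift_sketch;
```

Inside the process, a channel would then reduce to an assignment such as b10 <= shift_right_1(f1), at the cost of hiding the bit-level wiring that the explicit listing makes visible.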

Package 1

package pack1 is
  procedure bi_to_in -- change 16 bits (1 sign, 1 integer and 14 fraction) into real
    (variable x : bit_vector(15 downto 0);
     variable y : out integer);
  procedure in_to_bi -- change real into binary (1 sign, 1 integer, 14 fractions)
    (variable m : in integer;
     variable n : out bit_vector(15 downto 0));
end pack1;

package body pack1 is
  procedure bi_to_in
    (variable x : bit_vector(15 downto 0);
     variable y : out integer) is
    variable sum : integer := 0;
    variable p : bit_vector(15 downto 0);
  begin
    p := x;
    if p(15) = '1' then
      for i in 0 to 14 loop
        if p(i) = '1' then
          for i in 0 to 13 loop
            p(i+1) := not p(i+1);
          end loop; exit;
        end if;
      end loop;
      for k in 0 to 14 loop
        if p(k) = '1' then
          sum := sum + 2**k;
        end if;
      end loop;
      y := -sum;
    else
      for l in 0 to 14 loop
        if p(l) = '1' then
          sum := sum + 2**l;
        end if;
      end loop;
      y := sum;
    end if;
  end bi_to_in;

  procedure in_to_bi
    (variable m : in integer;
     variable n : out bit_vector(15 downto 0)) is
    variable temp_a : integer := 0;
    variable temp_b : integer := 0;
    variable w : bit_vector(15 downto 0);
  begin
    if m < 0 then
      temp_a := -m;
    else
      temp_a := m;
    end if;

    for i in 14 downto 0 loop
      temp_b := temp_a/(2**i);
      temp_a := temp_a rem (2**i);
      if (temp_b = 1) then
        w(i) := '1';
      else
        w(i) := '0';
      end if;
    end loop;
    if m > 0 then
      w(15) := '0';
    else
      w(15) := '1';
      for k in 0 to 14 loop
        if w(k) = '1' then
          for k in 0 to 13 loop
            w(k+1) := not w(k+1);
          end loop; exit;
        end if;
      end loop;
    end if;

    -- prevent negative zero occurs.
    if w(14) = '0' and w(13) = '0' and w(12) = '0' and w(11) = '0' and
       w(10) = '0' and
       w(9) = '0' and w(8) = '0' and w(7) = '0' and w(6) = '0' and w(5) = '0' and
       w(4) = '0' and w(3) = '0' and w(2) = '0' and w(1) = '0' and w(0) = '0' then
      w(15) := '0';
    end if;
    n := w;
  end in_to_bi;
end pack1;
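
The two procedures in Package 1 convert between a 16-bit vector and an integer. As a cross-check for hand calculation, the same mapping can be sketched as a pair of pure functions, assuming plain 16-bit two's complement is the intended encoding. The package name pack1_sketch and the function names to_int and to_bits are hypothetical and do not appear in the thesis code.

```vhdl
-- Hypothetical helper package, not part of the original listing.
package pack1_sketch is
  function to_int (x : bit_vector(15 downto 0)) return integer;
  function to_bits (m : integer) return bit_vector;
end pack1_sketch;

package body pack1_sketch is
  -- Two's complement to integer: bits 14..0 carry positive weights,
  -- bit 15 carries the weight -2**15.
  function to_int (x : bit_vector(15 downto 0)) return integer is
    variable sum : integer := 0;
  begin
    for i in 0 to 14 loop
      if x(i) = '1' then
        sum := sum + 2**i;
      end if;
    end loop;
    if x(15) = '1' then
      sum := sum - 2**15;
    end if;
    return sum;
  end to_int;

  -- Integer to 16-bit two's complement.
  -- Assumes m lies in the range -2**15 to 2**15 - 1.
  function to_bits (m : integer) return bit_vector is
    variable w : bit_vector(15 downto 0) := (others => '0');
    variable t : integer := m;
  begin
    if t < 0 then
      t := t + 2**16;  -- wrap negative values into 0 .. 2**16 - 1
    end if;
    for i in 0 to 15 loop
      if t mod 2 = 1 then
        w(i) := '1';
      end if;
      t := t / 2;
    end loop;
    return w;
  end to_bits;
end pack1_sketch;
```

Under that assumption, to_int(to_bits(m)) returns m for every m in the stated range, which gives a simple round-trip check against the values produced by bi_to_in and in_to_bi.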

16-bit adder_g

use work.pack1.all;
entity add_g is
  port(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16 :
         bit_vector(15 downto 0);
       b1,b2,b3,b4,b5,b6,b7,b8 : out bit_vector(15 downto 0);
       CLK,as : bit);
end add_g;

architecture beh of add_g is
begin
  process
    variable x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,
             n1,n2,n3,n4,n5,n6,n7,n8 : bit_vector(15 downto 0);
    variable y1,y2,y3,y4,y5,y6,y7,y8,y9,y10,y11,y12,y13,y14,y15,y16,
             m1,m2,m3,m4,m5,m6,m7,m8 : integer := 0;
  begin
    wait until CLK'event and CLK = '1';
    x1 := a1; x2 := a2; x3 := a3; x4 := a4;
    x5 := a5; x6 := a6; x7 := a7; x8 := a8;
    x9 := a9; x10 := a10; x11 := a11; x12 := a12;
    x13 := a13; x14 := a14; x15 := a15; x16 := a16;
    bi_to_in(x1,y1); bi_to_in(x2,y2); bi_to_in(x3,y3); bi_to_in(x4,y4);
    bi_to_in(x5,y5); bi_to_in(x6,y6); bi_to_in(x7,y7); bi_to_in(x8,y8);
    bi_to_in(x9,y9); bi_to_in(x10,y10); bi_to_in(x11,y11);
    bi_to_in(x12,y12);
    bi_to_in(x13,y13); bi_to_in(x14,y14); bi_to_in(x15,y15);
    bi_to_in(x16,y16);
    if as = '0' then
      m1 := y1 + y2; m2 := y3 + y4; m3 := y5 + y6; m4 := y7 + y8;
      m5 := y9 + y10; m6 := y11 + y12; m7 := y13 + y14; m8 := y15 + y16;
    else
      m1 := y1 - y2; m2 := y3 - y4; m3 := y5 - y6; m4 := y7 - y8;
      m5 := y9 - y10; m6 := y11 - y12; m7 := y13 - y14; m8 := y15 - y16;
    end if;
    in_to_bi(m1,n1); in_to_bi(m2,n2); in_to_bi(m3,n3); in_to_bi(m4,n4);
    in_to_bi(m5,n5); in_to_bi(m6,n6); in_to_bi(m7,n7); in_to_bi(m8,n8);
    b1 <= n1; b2 <= n2; b3 <= n3; b4 <= n4;
    b5 <= n5; b6 <= n6; b7 <= n7; b8 <= n8;
    wait on a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16,CLK;
  end process;
end beh;

Register_h

entity reg_h is
  port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(15 downto 0);
       b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(15 downto 0);
       CLK : bit);
end reg_h;

architecture beh of reg_h is
begin
  process
    variable d0,d1,d2,d3,d4,d5,d6,d7 : bit_vector(15 downto 0);
  begin
    d0 := a0;
    d1 := a1;
    d2 := a2;
    d3 := a3;
    d4 := a4;
    d5 := a5;
    d6 := a6;
    d7 := a7;
    wait until CLK'event and CLK = '1';
    b0 <= d0;
    b1 <= d1;
    b2 <= d2;
    b3 <= d3;
    b4 <= d4;
    b5 <= d5;
    b6 <= d6;
    b7 <= d7;
    wait on CLK;
  end process;
end beh;

Adder_i

use work.pack1.all;
entity add_i is
  port(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16 :
         bit_vector(15 downto 0);
       b1,b2,b3,b4,b5,b6,b7,b8 : out bit_vector(15 downto 0);
       CLK : bit);
end add_i;

architecture beh of add_i is
begin
  process
    variable x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,
             n1,n2,n3,n4,n5,n6,n7,n8 : bit_vector(15 downto 0);
    variable y1,y2,y3,y4,y5,y6,y7,y8,y9,y10,y11,y12,y13,y14,y15,y16,
             m1,m2,m3,m4,m5,m6,m7,m8 : integer := 0;
  begin
    x1 := a1; x2 := a2; x3 := a3; x4 := a4;
    x5 := a5; x6 := a6; x7 := a7; x8 := a8;
    x9 := a9; x10 := a10; x11 := a11; x12 := a12;
    x13 := a13; x14 := a14; x15 := a15; x16 := a16;
    bi_to_in(x1,y1); bi_to_in(x2,y2); bi_to_in(x3,y3); bi_to_in(x4,y4);
    bi_to_in(x5,y5); bi_to_in(x6,y6); bi_to_in(x7,y7); bi_to_in(x8,y8);
    bi_to_in(x9,y9); bi_to_in(x10,y10); bi_to_in(x11,y11);
    bi_to_in(x12,y12);
    bi_to_in(x13,y13); bi_to_in(x14,y14); bi_to_in(x15,y15);
    bi_to_in(x16,y16);

    m1 := y1 + y2; m2 := y3 + y4; m3 := y5 + y6; m4 := y7 + y8;
    m5 := y9 + y10; m6 := y11 + y12; m7 := y13 + y14; m8 := y15 + y16;

    in_to_bi(m1,n1); in_to_bi(m2,n2); in_to_bi(m3,n3); in_to_bi(m4,n4);
    in_to_bi(m5,n5); in_to_bi(m6,n6); in_to_bi(m7,n7); in_to_bi(m8,n8);

    b1 <= n1; b2 <= n2; b3 <= n3; b4 <= n4;
    b5 <= n5; b6 <= n6; b7 <= n7; b8 <= n8;

    wait on a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16;
  end process;
end beh;

Shift right 2-bit register

entity shi_2 is
  port(a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
       sr1,sr2,sr3,sr4,sr5,sr6,sr7,sr8,b1,b2,b3,b4,b5,b6,b7,b8 :
         out bit_vector(15 downto 0); clr : bit_vector(15 downto 0);
       CLK : bit);
end shi_2;

architecture beh of shi_2 is
begin
  process
    variable x1,x2,x3,x4,x5,x6,x7,x8,y1,y2,y3,y4,y5,y6,y7,y8 :
      bit_vector(15 downto 0);
    variable i : integer := 0;
  begin
    wait until CLK'event and CLK = '1';
    x1 := a1; x2 := a2; x3 := a3; x4 := a4;
    x5 := a5; x6 := a6; x7 := a7; x8 := a8;
    if x1(15) = '0' then
      y1(13) := x1(15); y1(12) := x1(14); y1(11) := x1(13);
      y1(10) := x1(12); y1(9) := x1(11); y1(8) := x1(10);
      y1(7) := x1(9); y1(6) := x1(8); y1(5) := x1(7);
      y1(4) := x1(6); y1(3) := x1(5); y1(2) := x1(4);
      y1(1) := x1(3); y1(0) := x1(2); y1(14) := '0';
      y1(15) := '0';
    else
      y1(13) := x1(15); y1(12) := x1(14); y1(11) := x1(13);
      y1(10) := x1(12); y1(9) := x1(11); y1(8) := x1(10);
      y1(7) := x1(9); y1(6) := x1(8); y1(5) := x1(7);
      y1(4) := x1(6); y1(3) := x1(5); y1(2) := x1(4);
      y1(1) := x1(3); y1(0) := x1(2); y1(14) := '1';
      y1(15) := '1';
    end if;

    if x2(15) = '0' then
      y2(13) := x2(15); y2(12) := x2(14); y2(11) := x2(13);
      y2(10) := x2(12); y2(9) := x2(11); y2(8) := x2(10);
      y2(7) := x2(9); y2(6) := x2(8); y2(5) := x2(7);
      y2(4) := x2(6); y2(3) := x2(5); y2(2) := x2(4);
      y2(1) := x2(3); y2(0) := x2(2); y2(14) := '0';
      y2(15) := '0';
    else
      y2(13) := x2(15); y2(12) := x2(14); y2(11) := x2(13);
      y2(10) := x2(12); y2(9) := x2(11); y2(8) := x2(10);
      y2(7) := x2(9); y2(6) := x2(8); y2(5) := x2(7);
      y2(4) := x2(6); y2(3) := x2(5); y2(2) := x2(4);
      y2(1) := x2(3); y2(0) := x2(2); y2(14) := '1';
      y2(15) := '1';
    end if;

    if x3(15) = '0' then
      y3(13) := x3(15); y3(12) := x3(14); y3(11) := x3(13);
      y3(10) := x3(12); y3(9) := x3(11); y3(8) := x3(10);
      y3(7) := x3(9); y3(6) := x3(8); y3(5) := x3(7);
      y3(4) := x3(6); y3(3) := x3(5); y3(2) := x3(4);
      y3(1) := x3(3); y3(0) := x3(2); y3(14) := '0';
      y3(15) := '0';
    else
      y3(13) := x3(15); y3(12) := x3(14); y3(11) := x3(13);
      y3(10) := x3(12); y3(9) := x3(11); y3(8) := x3(10);
      y3(7) := x3(9); y3(6) := x3(8); y3(5) := x3(7);
      y3(4) := x3(6); y3(3) := x3(5); y3(2) := x3(4);
      y3(1) := x3(3); y3(0) := x3(2); y3(14) := '1';
      y3(15) := '1';
    end if;

    if x4(15) = '0' then
      y4(13) := x4(15); y4(12) := x4(14); y4(11) := x4(13);
      y4(10) := x4(12); y4(9) := x4(11); y4(8) := x4(10);
      y4(7) := x4(9); y4(6) := x4(8); y4(5) := x4(7);
      y4(4) := x4(6); y4(3) := x4(5); y4(2) := x4(4);
      y4(1) := x4(3); y4(0) := x4(2); y4(14) := '0';
      y4(15) := '0';
    else
      y4(13) := x4(15); y4(12) := x4(14); y4(11) := x4(13);
      y4(10) := x4(12); y4(9) := x4(11); y4(8) := x4(10);
      y4(7) := x4(9); y4(6) := x4(8); y4(5) := x4(7);
      y4(4) := x4(6); y4(3) := x4(5); y4(2) := x4(4);
      y4(1) := x4(3); y4(0) := x4(2); y4(14) := '1';
      y4(15) := '1';
    end if;

    if x5(15) = '0' then
      y5(13) := x5(15); y5(12) := x5(14); y5(11) := x5(13);
      y5(10) := x5(12); y5(9) := x5(11); y5(8) := x5(10);
      y5(7) := x5(9); y5(6) := x5(8); y5(5) := x5(7);
      y5(4) := x5(6); y5(3) := x5(5); y5(2) := x5(4);
      y5(1) := x5(3); y5(0) := x5(2); y5(14) := '0';
      y5(15) := '0';
    else
      y5(13) := x5(15); y5(12) := x5(14); y5(11) := x5(13);
      y5(10) := x5(12); y5(9) := x5(11); y5(8) := x5(10);
      y5(7) := x5(9); y5(6) := x5(8); y5(5) := x5(7);
      y5(4) := x5(6); y5(3) := x5(5); y5(2) := x5(4);
      y5(1) := x5(3); y5(0) := x5(2); y5(14) := '1';
      y5(15) := '1';
    end if;

    if x6(15) = '0' then
      y6(13) := x6(15); y6(12) := x6(14); y6(11) := x6(13);
      y6(10) := x6(12); y6(9) := x6(11); y6(8) := x6(10);
      y6(7) := x6(9); y6(6) := x6(8); y6(5) := x6(7);
      y6(4) := x6(6); y6(3) := x6(5); y6(2) := x6(4);
      y6(1) := x6(3); y6(0) := x6(2); y6(14) := '0';
      y6(15) := '0';
    else
      y6(13) := x6(15); y6(12) := x6(14); y6(11) := x6(13);
      y6(10) := x6(12); y6(9) := x6(11); y6(8) := x6(10);
      y6(7) := x6(9); y6(6) := x6(8); y6(5) := x6(7);
      y6(4) := x6(6); y6(3) := x6(5); y6(2) := x6(4);
      y6(1) := x6(3); y6(0) := x6(2); y6(14) := '1';
      y6(15) := '1';
    end if;

    if x7(15) = '0' then
      y7(13) := x7(15); y7(12) := x7(14); y7(11) := x7(13);
      y7(10) := x7(12); y7(9) := x7(11); y7(8) := x7(10);
      y7(7) := x7(9); y7(6) := x7(8); y7(5) := x7(7);
      y7(4) := x7(6); y7(3) := x7(5); y7(2) := x7(4);
      y7(1) := x7(3); y7(0) := x7(2); y7(14) := '0';
      y7(15) := '0';
    else
      y7(13) := x7(15); y7(12) := x7(14); y7(11) := x7(13);
      y7(10) := x7(12); y7(9) := x7(11); y7(8) := x7(10);
      y7(7) := x7(9); y7(6) := x7(8); y7(5) := x7(7);
      y7(4) := x7(6); y7(3) := x7(5); y7(2) := x7(4);
      y7(1) := x7(3); y7(0) := x7(2); y7(14) := '1';
      y7(15) := '1';
    end if;

    if x8(15) = '0' then
      y8(13) := x8(15); y8(12) := x8(14); y8(11) := x8(13);
      y8(10) := x8(12); y8(9) := x8(11); y8(8) := x8(10);
      y8(7) := x8(9); y8(6) := x8(8); y8(5) := x8(7);
      y8(4) := x8(6); y8(3) := x8(5); y8(2) := x8(4);
      y8(1) := x8(3); y8(0) := x8(2); y8(14) := '0';
      y8(15) := '0';
    else
      y8(13) := x8(15); y8(12) := x8(14); y8(11) := x8(13);
      y8(10) := x8(12); y8(9) := x8(11); y8(8) := x8(10);
      y8(7) := x8(9); y8(6) := x8(8); y8(5) := x8(7);
      y8(4) := x8(6); y8(3) := x8(5); y8(2) := x8(4);
      y8(1) := x8(3); y8(0) := x8(2); y8(14) := '1';
      y8(15) := '1';
    end if;

    sr1 <= y1; sr2 <= y2; sr3 <= y3; sr4 <= y4;
    sr5 <= y5; sr6 <= y6; sr7 <= y7; sr8 <= y8;
    i := i+1;
    if i = 6 then
      b1 <= y1; b2 <= y2; b3 <= y3; b4 <= y4;
      b5 <= y5; b6 <= y6; b7 <= y7; b8 <= y8;
      x1 := clr; x2 := clr; x3 := clr; x4 := clr;
      x5 := clr; x6 := clr; x7 := clr; x8 := clr;
      sr1 <= clr; sr2 <= clr; sr3 <= clr; sr4 <= clr;
      sr5 <= clr; sr6 <= clr; sr7 <= clr; sr8 <= clr;
      i := 0;
    end if;
    wait on a1,a2,a3,a4,a5,a6,a7,a8,clr,CLK;
  end process;
end beh;
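
Each if/else block in shi_2 copies bits 15..2 of the input into bits 13..0 of the output and fills bits 15 and 14 with the sign bit, which is an arithmetic shift right by two. A minimal sketch of the same operation as one function follows; the package name shi_sketch and the function name asr2 are hypothetical and not part of the thesis code.

```vhdl
-- Hypothetical helper package, not part of the original listing.
package shi_sketch is
  function asr2 (x : bit_vector(15 downto 0)) return bit_vector;
end shi_sketch;

package body shi_sketch is
  -- Arithmetic shift right by two positions with sign extension.
  function asr2 (x : bit_vector(15 downto 0)) return bit_vector is
    variable y : bit_vector(15 downto 0);
  begin
    y(15) := x(15);                    -- sign bit replicated
    y(14) := x(15);
    y(13 downto 0) := x(15 downto 2);  -- bits 15..2 move down to 13..0
    return y;
  end asr2;
end shi_sketch;
```

With such a helper, each of the eight blocks would reduce to a single assignment of the form y1 := asr2(x1), because both branches of the original if statement differ only in the constant written to bits 15 and 14, and that constant always equals x1(15).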

Result output

entity result is
  port(a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
       k : out bit_vector(15 downto 0); CLK : bit);
end result;

architecture beh of result is
  type r is array (0 to 7) of bit_vector(15 downto 0);
begin
  process
    variable x : r;
  begin
    x(0) := a1; x(1) := a2; x(2) := a3; x(3) := a4;
    x(4) := a5; x(5) := a6; x(6) := a7; x(7) := a8;
    for i in 0 to 7 loop
      wait until CLK'event and CLK = '1';
      k <= x(i);
    end loop;
    wait on a1,a2,a3,a4,a5,a6,a7,a8,CLK;
  end process;
end beh;

Test bench

use work.pack1.all;
entity test is end test;
architecture str of test is
  component clock_ge port(CLCK : inout bit);
  end component;
  component clock port(CLK : inout bit);
  end component;
  component control port(CLK : bit; ct : out bit);
  end component;
  component LOAD port(AI : in bit_vector(11 downto 0);
    B0,B1,B2,B3,B4,B5,B6,B7 : out bit_vector(11 downto 0);
    CLK : in bit);
  end component;
  component shift
    port(bi0,bi1,bi2,bi3,bi4,bi5,bi6,bi7 : in bit_vector(11 downto 0);
         bo0,bo1,bo2,bo3,bo4,bo5,bo6,bo7 : out bit_vector(1 downto 0);
         CLK : in bit);
  end component;
  component adsu
    port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(1 downto 0);
         b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(1 downto 0);
         CLK,cr,st : bit);
  end component;
  component reg
    port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(1 downto 0);
         b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(1 downto 0);
         CLK : bit);
  end component;
  component rom
    port(e0,e1,e2,e3,e4,e5,e6,e7 : bit_vector(1 downto 0);
         b10,b11,b20,b21,b30,b31,b40,b41,b50,b51,b60,b61,b70,b71,b80,b81 :
           out bit_vector(15 downto 0);
         CLK : bit);
  end component;
  component shi_1
    port(f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16 :
           bit_vector(15 downto 0);
         b10,b11,b20,b21,b30,b31,b40,b41,b50,b51,b60,b61,b70,b71,b80,b81 :
           out bit_vector(15 downto 0);
         CLK : bit);
  end component;
  component delay1
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component delay2
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component delay3
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component delay4
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component delay5
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component delay6
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component delay7
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component delay8
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component delay9
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component delay10
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component add_g
    port(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16 :
           bit_vector(15 downto 0);
         b1,b2,b3,b4,b5,b6,b7,b8 : out bit_vector(15 downto 0);
         CLK,as : bit);
  end component;
  component reg_h
    port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(15 downto 0);
         b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(15 downto 0);
         CLK : bit);
  end component;
  component add_i
    port(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16 :
           bit_vector(15 downto 0); b1,b2,b3,b4,b5,b6,b7,b8 :
           out bit_vector(15 downto 0); CLK : bit);
  end component;
  component shi_2
    port(a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
         sr1,sr2,sr3,sr4,sr5,sr6,sr7,sr8,b1,b2,b3,b4,b5,b6,b7,b8 :
           out bit_vector(15 downto 0); clr : bit_vector(15 downto 0);
         CLK : bit);
  end component;
  component result
    port(a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
         k : out bit_vector(15 downto 0); CLK : bit);
  end component;

  for C : clock_ge use entity work.clock_ge(clk_ctl);
  for ad : clock use entity work.clock(beh);
  for a : control use entity work.control(beh);
  for L : LOAD use entity work.LOAD(BEH);
  for S : shift use entity work.shift(beh);
  for D : adsu use entity work.adsu(beh);
  for r : reg use entity work.reg(beh);
  for o : rom use entity work.rom(beh);
  for s_1 : shi_1 use entity work.shi_1(beh);
  for b : delay1 use entity work.delay1(beh);
  for e : delay2 use entity work.delay2(beh);
  for dely3 : delay3 use entity work.delay3(beh);
  for dely4 : delay4 use entity work.delay4(beh);
  for dely5 : delay5 use entity work.delay5(beh);
  for dely6 : delay6 use entity work.delay6(beh);
  for dely7 : delay7 use entity work.delay7(beh);
  for dely8 : delay8 use entity work.delay8(beh);
  for dely9 : delay9 use entity work.delay9(beh);
  for dely10 : delay10 use entity work.delay10(beh);
  for g : add_g use entity work.add_g(beh);
  for h : reg_h use entity work.reg_h(beh);
  for i : add_i use entity work.add_i(beh);
  for j : shi_2 use entity work.shi_2(beh);
  for t : result use entity work.result(beh);

  signal di : bit_vector(11 downto 0);
  signal ck : bit;
  signal clck : bit;
  signal go : bit;
  signal io : bit;
  signal ho : bit;
  signal te : bit;
  signal de : bit;
  signal ab : bit;
  signal cd : bit;
  signal ef : bit;
  signal gh : bit;
  signal ij : bit;
  signal kl : bit;
  signal d0,d1,d2,d3,d4,d5,d6,d7 : bit_vector(11 downto 0);
  signal so0,so1,so2,so3,so4,so5,so6,so7 : bit_vector(1 downto 0);
  signal co0,co1,co2,co3,co4,co5,co6,co7 : bit_vector(1 downto 0);
  signal do0,do1,do2,do3,do4,do5,do6,do7 : bit_vector(1 downto 0);
  signal clr : bit := '0';
  signal set : bit := '0';
  signal e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,e11,e12,e13,e14,e15,e16 :
    bit_vector(15 downto 0);
  signal f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16 :
    bit_vector(15 downto 0);
  signal g1,g2,g3,g4,g5,g6,g7,g8 : bit_vector(15 downto 0);
  signal h1,h2,h3,h4,h5,h6,h7,h8 : bit_vector(15 downto 0);
  signal i1,i2,i3,i4,i5,i6,i7,i8 : bit_vector(15 downto 0);
  signal j1,j2,j3,j4,j5,j6,j7,j8 : bit_vector(15 downto 0);
  signal r1,r2,r3,r4,r5,r6,r7,r8 : bit_vector(15 downto 0);
  signal cr : bit_vector(15 downto 0) := "0000000000000000";
  signal p : bit_vector(15 downto 0);

begin
  C : clock_ge port map(ck);
  ad : clock port map(clck);
  a : control port map(ck,go);
  b : delay1 port map(go,io,ck);
  e : delay2 port map(ck,ho,clck);
  dely3 : delay3 port map(ho,te,clck);
  dely4 : delay4 port map(te,de,clck);
  dely5 : delay5 port map(de,ab,clck);
  dely6 : delay6 port map(ab,cd,clck);
  dely7 : delay7 port map(cd,ef,clck);
  dely8 : delay8 port map(ef,gh,clck);
  dely9 : delay9 port map(gh,ij,clck);
  dely10 : delay10 port map(ij,kl,clck);
  L : LOAD port map(di,d0,d1,d2,d3,d4,d5,d6,d7,ck);
  S : shift port map(d0,d1,d2,d3,d4,d5,d6,d7,
                     so0,so1,so2,so3,so4,so5,so6,so7,ck);
  D : adsu port map(so0,so1,so2,so3,so4,so5,so6,so7,
                    co0,co1,co2,co3,co4,co5,co6,co7,
                    ck,clr,set);
  r : reg port map(co0,co1,co2,co3,co4,co5,co6,co7,
                   do0,do1,do2,do3,do4,do5,do6,do7,
                   ck);
  o : rom port map(do0,do1,do2,do3,do4,do5,do6,do7,
                   e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,e11,e12,e13,e14,
                   e15,e16,ck);
  s_1 : shi_1 port map(e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,e11,e12,e13,e14,e15,e16,
                       f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16,
                       ck);
  g : add_g port map(f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16,
                     g1,g2,g3,g4,g5,g6,g7,g8,ck,io);
  h : reg_h port map(g1,g2,g3,g4,g5,g6,g7,g8,h1,h2,h3,h4,h5,h6,h7,h8,ck);
  i : add_i port map(h1,r1,h2,r2,h3,r3,h4,r4,h5,r5,h6,r6,h7,r7,h8,r8,
                     i1,i2,i3,i4,i5,i6,i7,i8,ck);
  j : shi_2 port map(i1,i2,i3,i4,i5,i6,i7,i8,r1,r2,r3,r4,r5,r6,r7,r8,
                     j1,j2,j3,j4,j5,j6,j7,j8,cr,kl);
  t : result port map(j1,j2,j3,j4,j5,j6,j7,j8,p,ck);
  set <= '1' after 5 ns;
  di <= "000101101010" after 7 ns,
        "000000000000" after 17 ns,
        "000101101010" after 27 ns,
        "001011010100" after 37 ns,
        "000101101010" after 47 ns,
        "000000000000" after 57 ns,
        "000101101010" after 67 ns,
        "001011010100" after 77 ns;
end str;

APPENDIX B. 16-BIT 1-D DCT VHDL SOURCE CODE

Shift right 2-bit register

entity shi_2 is
  port(a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
       sr1,sr2,sr3,sr4,sr5,sr6,sr7,sr8,b1,b2,b3,b4,b5,b6,b7,b8 :
         out bit_vector(15 downto 0); clr : bit_vector(15 downto 0);
       CLK : bit);
end shi_2;

architecture beh of shi_2 is
begin
  process
    variable x1,x2,x3,x4,x5,x6,x7,x8,y1,y2,y3,y4,y5,y6,y7,y8 :
      bit_vector(15 downto 0);
    variable i : integer := 0;
  begin
    wait until CLK'event and CLK = '1';
    x1 := a1; x2 := a2; x3 := a3; x4 := a4;
    x5 := a5; x6 := a6; x7 := a7; x8 := a8;
    if x1(15) = '0' then
      y1(13) := x1(15); y1(12) := x1(14); y1(11) := x1(13);
      y1(10) := x1(12); y1(9) := x1(11); y1(8) := x1(10);
      y1(7) := x1(9); y1(6) := x1(8); y1(5) := x1(7);
      y1(4) := x1(6); y1(3) := x1(5); y1(2) := x1(4);
      y1(1) := x1(3); y1(0) := x1(2); y1(14) := '0';
      y1(15) := '0';
    else
      y1(13) := x1(15); y1(12) := x1(14); y1(11) := x1(13);
      y1(10) := x1(12); y1(9) := x1(11); y1(8) := x1(10);
      y1(7) := x1(9); y1(6) := x1(8); y1(5) := x1(7);
      y1(4) := x1(6); y1(3) := x1(5); y1(2) := x1(4);
      y1(1) := x1(3); y1(0) := x1(2); y1(14) := '1';
      y1(15) := '1';
    end if;

    if x2(15) = '0' then
      y2(13) := x2(15); y2(12) := x2(14); y2(11) := x2(13);
      y2(10) := x2(12); y2(9) := x2(11); y2(8) := x2(10);
      y2(7) := x2(9); y2(6) := x2(8); y2(5) := x2(7);
      y2(4) := x2(6); y2(3) := x2(5); y2(2) := x2(4);
      y2(1) := x2(3); y2(0) := x2(2); y2(14) := '0';
      y2(15) := '0';
    else
      y2(13) := x2(15); y2(12) := x2(14); y2(11) := x2(13);
      y2(10) := x2(12); y2(9) := x2(11); y2(8) := x2(10);
      y2(7) := x2(9); y2(6) := x2(8); y2(5) := x2(7);
      y2(4) := x2(6); y2(3) := x2(5); y2(2) := x2(4);
      y2(1) := x2(3); y2(0) := x2(2); y2(14) := '1';
      y2(15) := '1';
    end if;

    if x3(15) = '0' then
      y3(13) := x3(15); y3(12) := x3(14); y3(11) := x3(13);
      y3(10) := x3(12); y3(9) := x3(11); y3(8) := x3(10);
      y3(7) := x3(9); y3(6) := x3(8); y3(5) := x3(7);
      y3(4) := x3(6); y3(3) := x3(5); y3(2) := x3(4);
      y3(1) := x3(3); y3(0) := x3(2); y3(14) := '0';
      y3(15) := '0';
    else
      y3(13) := x3(15); y3(12) := x3(14); y3(11) := x3(13);
      y3(10) := x3(12); y3(9) := x3(11); y3(8) := x3(10);
      y3(7) := x3(9); y3(6) := x3(8); y3(5) := x3(7);
      y3(4) := x3(6); y3(3) := x3(5); y3(2) := x3(4);
      y3(1) := x3(3); y3(0) := x3(2); y3(14) := '1';
      y3(15) := '1';
    end if;

    if x4(15) = '0' then
      y4(13) := x4(15); y4(12) := x4(14); y4(11) := x4(13);
      y4(10) := x4(12); y4(9) := x4(11); y4(8) := x4(10);
      y4(7) := x4(9); y4(6) := x4(8); y4(5) := x4(7);
      y4(4) := x4(6); y4(3) := x4(5); y4(2) := x4(4);
      y4(1) := x4(3); y4(0) := x4(2); y4(14) := '0';
      y4(15) := '0';
    else
      y4(13) := x4(15); y4(12) := x4(14); y4(11) := x4(13);
      y4(10) := x4(12); y4(9) := x4(11); y4(8) := x4(10);
      y4(7) := x4(9); y4(6) := x4(8); y4(5) := x4(7);
      y4(4) := x4(6); y4(3) := x4(5); y4(2) := x4(4);
      y4(1) := x4(3); y4(0) := x4(2); y4(14) := '1';
      y4(15) := '1';
    end if;

    if x5(15) = '0' then
      y5(13) := x5(15); y5(12) := x5(14); y5(11) := x5(13);
      y5(10) := x5(12); y5(9) := x5(11); y5(8) := x5(10);
      y5(7) := x5(9); y5(6) := x5(8); y5(5) := x5(7);
      y5(4) := x5(6); y5(3) := x5(5); y5(2) := x5(4);
      y5(1) := x5(3); y5(0) := x5(2); y5(14) := '0';
      y5(15) := '0';
    else
      y5(13) := x5(15); y5(12) := x5(14); y5(11) := x5(13);
      y5(10) := x5(12); y5(9) := x5(11); y5(8) := x5(10);
      y5(7) := x5(9); y5(6) := x5(8); y5(5) := x5(7);
      y5(4) := x5(6); y5(3) := x5(5); y5(2) := x5(4);
      y5(1) := x5(3); y5(0) := x5(2); y5(14) := '1';
      y5(15) := '1';
    end if;

    if x6(15) = '0' then
      y6(13) := x6(15); y6(12) := x6(14); y6(11) := x6(13);
      y6(10) := x6(12); y6(9) := x6(11); y6(8) := x6(10);
      y6(7) := x6(9); y6(6) := x6(8); y6(5) := x6(7);
      y6(4) := x6(6); y6(3) := x6(5); y6(2) := x6(4);
      y6(1) := x6(3); y6(0) := x6(2); y6(14) := '0';
      y6(15) := '0';
    else
      y6(13) := x6(15); y6(12) := x6(14); y6(11) := x6(13);
      y6(10) := x6(12); y6(9) := x6(11); y6(8) := x6(10);
      y6(7) := x6(9); y6(6) := x6(8); y6(5) := x6(7);
      y6(4) := x6(6); y6(3) := x6(5); y6(2) := x6(4);
      y6(1) := x6(3); y6(0) := x6(2); y6(14) := '1';
      y6(15) := '1';
    end if;

    if x7(15) = '0' then
      y7(13) := x7(15); y7(12) := x7(14); y7(11) := x7(13);
      y7(10) := x7(12); y7(9) := x7(11); y7(8) := x7(10);
      y7(7) := x7(9); y7(6) := x7(8); y7(5) := x7(7);
      y7(4) := x7(6); y7(3) := x7(5); y7(2) := x7(4);
      y7(1) := x7(3); y7(0) := x7(2); y7(14) := '0';
      y7(15) := '0';
    else
      y7(13) := x7(15); y7(12) := x7(14); y7(11) := x7(13);
      y7(10) := x7(12); y7(9) := x7(11); y7(8) := x7(10);
      y7(7) := x7(9); y7(6) := x7(8); y7(5) := x7(7);
      y7(4) := x7(6); y7(3) := x7(5); y7(2) := x7(4);
      y7(1) := x7(3); y7(0) := x7(2); y7(14) := '1';
      y7(15) := '1';
    end if;

    if x8(15) = '0' then
      y8(13) := x8(15); y8(12) := x8(14); y8(11) := x8(13);
      y8(10) := x8(12); y8(9) := x8(11); y8(8) := x8(10);
      y8(7) := x8(9); y8(6) := x8(8); y8(5) := x8(7);
      y8(4) := x8(6); y8(3) := x8(5); y8(2) := x8(4);
      y8(1) := x8(3); y8(0) := x8(2); y8(14) := '0';
      y8(15) := '0';
    else
      y8(13) := x8(15); y8(12) := x8(14); y8(11) := x8(13);
      y8(10) := x8(12); y8(9) := x8(11); y8(8) := x8(10);
      y8(7) := x8(9); y8(6) := x8(8); y8(5) := x8(7);
      y8(4) := x8(6); y8(3) := x8(5); y8(2) := x8(4);
      y8(1) := x8(3); y8(0) := x8(2); y8(14) := '1';
      y8(15) := '1';
    end if;

    sr1 <= y1; sr2 <= y2; sr3 <= y3; sr4 <= y4;
    sr5 <= y5; sr6 <= y6; sr7 <= y7; sr8 <= y8;
    i := i+1;
    if i = 8 then
      b1 <= y1; b2 <= y2; b3 <= y3; b4 <= y4;
      b5 <= y5; b6 <= y6; b7 <= y7; b8 <= y8;
      x1 := clr; x2 := clr; x3 := clr; x4 := clr;
      x5 := clr; x6 := clr; x7 := clr; x8 := clr;
      sr1 <= clr; sr2 <= clr; sr3 <= clr; sr4 <= clr;
      sr5 <= clr; sr6 <= clr; sr7 <= clr; sr8 <= clr;
      i := 0;
    end if;
    wait on a1,a2,a3,a4,a5,a6,a7,a8,clr,CLK;
  end process;
end beh;

Test bench

use work.pack1.all;
entity test is end test;
architecture str of test is
  component clock_ge port(CLCK : inout bit);
  end component;
  component clock port(CLK : inout bit);
  end component;
  component control port(CLK : bit; ct : out bit);
  end component;
  component LOAD port(AI : in bit_vector(15 downto 0);
    B0,B1,B2,B3,B4,B5,B6,B7 : out bit_vector(15 downto 0);
    CLK : in bit);
  end component;
  component shift
    port(bi0,bi1,bi2,bi3,bi4,bi5,bi6,bi7 : in bit_vector(15 downto 0);
         bo0,bo1,bo2,bo3,bo4,bo5,bo6,bo7 : out bit_vector(1 downto 0);
         CLK : in bit);
  end component;
  component adsu
    port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(1 downto 0);
         b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(1 downto 0);
         CLK,cr,st : bit);
  end component;
  component reg
    port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(1 downto 0);
         b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(1 downto 0);
         CLK : bit);
  end component;
  component rom
    port(e0,e1,e2,e3,e4,e5,e6,e7 : bit_vector(1 downto 0);
         b10,b11,b20,b21,b30,b31,b40,b41,b50,b51,b60,b61,b70,b71,b80,b81 :
           out bit_vector(15 downto 0);
         CLK : bit);
  end component;
  component shi_1
    port(f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16 :
           bit_vector(15 downto 0);
         b10,b11,b20,b21,b30,b31,b40,b41,b50,b51,b60,b61,b70,b71,b80,b81 :
           out bit_vector(15 downto 0);
         CLK : bit);
  end component;
  component delay1
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component delay2
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component delay3
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component delay4
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component delay15
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component delay16
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component delay17
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component delay18
    port(a : bit; b : out bit; CLK : bit);
  end component;
  component add_g
    port(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16 :
           bit_vector(15 downto 0);
         b1,b2,b3,b4,b5,b6,b7,b8 : out bit_vector(15 downto 0);
         CLK,as : bit);
  end component;
  component reg_h
    port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(15 downto 0);
         b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(15 downto 0);
         CLK : bit);
  end component;
  component add_i
    port(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16 :
           bit_vector(15 downto 0); b1,b2,b3,b4,b5,b6,b7,b8 :
           out bit_vector(15 downto 0); CLK : bit);
  end component;
  component shi_2
    port(a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
         sr1,sr2,sr3,sr4,sr5,sr6,sr7,sr8,b1,b2,b3,b4,b5,b6,b7,b8 :
           out bit_vector(15 downto 0); clr : bit_vector(15 downto 0);
         CLK : bit);
  end component;
  component result
    port(a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
         k : out bit_vector(15 downto 0); CLK : bit);
  end component;

  for C : clock_ge use entity work.clock_ge(clk_ctl);
  for ad : clock use entity work.clock(beh);
  for a : control use entity work.control(beh);
  for L : LOAD use entity work.LOAD(BEH);
  for S : shift use entity work.shift(beh);
  for D : adsu use entity work.adsu(beh);
  for r : reg use entity work.reg(beh);
  for o : rom use entity work.rom(beh);
  for s_1 : shi_1 use entity work.shi_1(beh);
  for b : delay1 use entity work.delay1(beh);
  for e : delay2 use entity work.delay2(beh);
  for dely3 : delay3 use entity work.delay3(beh);
  for dely4 : delay4 use entity work.delay4(beh);
  for dely15 : delay15 use entity work.delay15(beh);
  for dely16 : delay16 use entity work.delay16(beh);
  for dely17 : delay17 use entity work.delay17(beh);
  for dely18 : delay18 use entity work.delay18(beh);
  for g : add_g use entity work.add_g(beh);
  for h : reg_h use entity work.reg_h(beh);
  for i : add_i use entity work.add_i(beh);
  for j : shi_2 use entity work.shi_2(beh);
  for t : result use entity work.result(beh);

  signal di : bit_vector(15 downto 0);
  signal ck : bit;
  signal clck : bit;
  signal go : bit;
  signal io : bit;
  signal ho : bit;
  signal te : bit;
  signal de : bit;
  signal op,qr,st,eo,ko,mo,qo,ro,so,uo : bit;
  signal d0,d1,d2,d3,d4,d5,d6,d7 : bit_vector(15 downto 0);
  signal so0,so1,so2,so3,so4,so5,so6,so7 : bit_vector(1 downto 0);
  signal co0,co1,co2,co3,co4,co5,co6,co7 : bit_vector(1 downto 0);
  signal do0,do1,do2,do3,do4,do5,do6,do7 : bit_vector(1 downto 0);
  signal clr : bit := '0';
  signal set : bit := '0';
  signal e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,e11,e12,e13,e14,e15,e16 :
    bit_vector(15 downto 0);
  signal f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16 :
    bit_vector(15 downto 0);
  signal g1,g2,g3,g4,g5,g6,g7,g8 : bit_vector(15 downto 0);
  signal h1,h2,h3,h4,h5,h6,h7,h8 : bit_vector(15 downto 0);
  signal i1,i2,i3,i4,i5,i6,i7,i8 : bit_vector(15 downto 0);
  signal j1,j2,j3,j4,j5,j6,j7,j8 : bit_vector(15 downto 0);
  signal r1,r2,r3,r4,r5,r6,r7,r8 : bit_vector(15 downto 0);
  signal cr : bit_vector(15 downto 0) := "0000000000000000";
  signal p : bit_vector(15 downto 0);
begin 

C : clock_ge port map(ck);
ad : clock port map(clck);
a : control port map(ck,go);
b : delay1 port map(go,io,ck);
e : delay2 port map(ck,ho,clck);
dely3 : delay3 port map(ho,te,clck);
dely4 : delay4 port map(te,de,clck);
dely15 : delay15 port map(io,eo,ck);
dely16 : delay16 port map(eo,ko,ck);
dely17 : delay17 port map(ko,mo,ck);
dely18 : delay18 port map(mo,qo,ck);
L : LOAD port map(di,d0,d1,d2,d3,d4,d5,d6,d7,ck);
S : shift port map(d0,d1,d2,d3,d4,d5,d6,d7,
                   so0,so1,so2,so3,so4,so5,so6,so7,de);
D : adsu port map(so0,so1,so2,so3,so4,so5,so6,so7,
                  co0,co1,co2,co3,co4,co5,co6,co7,
                  ck,clr,set);
r : reg port map(co0,co1,co2,co3,co4,co5,co6,co7,
                 do0,do1,do2,do3,do4,do5,do6,do7,
                 ck);
o : rom port map(do0,do1,do2,do3,do4,do5,do6,do7,
                 e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,e11,e12,e13,e14,
                 e15,e16,ck);
s_1 : shi_1 port map(e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,e11,e12,e13,e14,e15,e16,
                     f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16,
                     ck);
g : add_g port map(f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16,
                   g1,g2,g3,g4,g5,g6,g7,g8,ck,qo);
h : reg_h port map(g1,g2,g3,g4,g5,g6,g7,g8,h1,h2,h3,h4,h5,h6,h7,h8,ck);
i : add_i port map(h1,r1,h2,r2,h3,r3,h4,r4,h5,r5,h6,r6,h7,r7,h8,r8,
                   i1,i2,i3,i4,i5,i6,i7,i8,ck);
j : shi_2 port map(i1,i2,i3,i4,i5,i6,i7,i8,r1,r2,r3,r4,r5,r6,r7,r8,
                   j1,j2,j3,j4,j5,j6,j7,j8,cr,ho);
t : result port map(j1,j2,j3,j4,j5,j6,j7,j8,p,ck);
set <= '1' after 5 ns;
di <= "0000010110101000" after 7 ns,
      "0000000000000000" after 17 ns,
      "0000010110101000" after 27 ns,
      "0000101101010000" after 37 ns,
      "0000010110101000" after 47 ns,
      "0000000000000000" after 57 ns,
      "0000010110101000" after 67 ns,
      "0000101101010000" after 77 ns;
end str;
APPENDIX C.  MATLAB PROGRAM OF DECIMAL-BINARY CONVERSION

while 1
    x(1,16) = 0;                 % 16-bit result; x(1) is the sign bit
    y = input('Please enter your number : ');
    if y == 0
        break                    % entering 0 ends the program
    end
    disp('wait!');
    if y > 0,
        x(1) = 0;
    else
        x(1) = 1;
        y = abs(y);              % only the magnitude is converted
    end
    i = 2;
    for k = 1:15;
        if y >= 1,               % >= so that y == 1 also produces a 1 bit
            x(i) = fix(y);
            y = y - x(i);
        else
            x(i) = 0;
        end
        y = 2 * y;               % move the next fraction bit into place
        i = i + 1;
    end
    disp(x);
end
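
The conversion loop in Appendix C can be restated compactly. The Python sketch below is an illustration only, not part of the thesis code: the names `to_fixed_bits` and `bits_to_string` are hypothetical, and the magnitude check is an added assumption (the MATLAB listing does not guard against inputs of 2 or more). Like the listing, it emits a sign bit, one integer bit, and then truncated fraction bits.

```python
def to_fixed_bits(value, width=16):
    """Return `width` bits: sign bit, one integer bit, then fraction bits.

    Mirrors the loop of the MATLAB listing: the magnitude is examined one
    bit at a time and doubled after every step (truncation, no rounding).
    """
    if abs(value) >= 2:
        # Assumption added for this sketch: only one integer bit is available.
        raise ValueError("magnitude must be below 2 for this format")
    bits = [0] * width
    if value < 0:
        bits[0] = 1          # sign bit, x(1) in the listing
        value = -value       # only the magnitude is converted
    for i in range(1, width):
        if value >= 1:
            bits[i] = 1
            value -= 1
        value *= 2           # bring the next fraction bit up to the ones place
    return bits


def bits_to_string(bits):
    """Join a list of 0/1 integers into a bit string."""
    return "".join(str(b) for b in bits)


if __name__ == "__main__":
    # 0.7071 is approximately cos(pi/4)
    print(bits_to_string(to_fixed_bits(0.7071)))  # prints 0010110101000001
```

The printed pattern for 0.7071 is the operand 0010110101000001 that appears in the hand calculations of Appendix D.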
APPENDIX D.  STRUCTURAL 1-D DCT HAND CALCULATION

(1)    0000000000000000
     + 0000000000000000
       0000000000000000
(2)    0000000000000000
     + 0101101010000010
       0101101010000010
     + 0000000000000000
       0101101010000010
(3)    0010110101000001
     + 0101101010000010
       0111000100100010
     + 0001011010100000
       1000011111000010
(4)    0010110101000001
     + 0101101010000010
       0111000100100010
     + 0010000111110000
       1001001100010010
(5)    0010110101000001
     + 0000000000000000
       0001011010100000
     + 0010010011000100
       0011101101100100
(6)    0010110101000001
     + 0000000000000000
       0001011010100000
     + 0000111011011001
       0010010101111001
(7)    0010110101000001
     + 0000000000000000
       0001011010100000
     + 0000100101011110
       0001111111111110
(8)    0000000000000000
     - 0000000000000000
       0000000000000000
     + 0000011111111111
       0000011111111111

Fig. 19  U0 hand calculation
(1)    0000000000000000
     + 0000000000000000
       0000000000000000
(2)    0000000000000000
     + 0101001000000011
       0101001000000011
     + 0000000000000000
       0101001000000011
(3)    0001100000000101
     + 0011100111111101
       0100010111111111
     + 0001010010000000
       0101101001111111
(4)    0001100000000101
     + 0011100111111101
       0100010111111111
     + 0001011010011111
       0101110010011110
(5)    0011100111111101
     + 0001100000000101
       0011010100000011
     + 0001011100100111
       0100110000101010
(6)    0011100111111101
     + 0001100000000101
       0011010100000011
     + 0001001100001010
       0100100000001101
(7)    0001100000000101
     + 0001100000000101
       0010010000000111
     + 0001001000000011
       0011011000001010
(8)    0001100000000101
     - 0001100000000101
       1111001111111101
     + 0000110110000010
       0000000101111111

Fig. 20  V1 hand calculation
(1)    0000000000000000
     + 0000000000000000
       0000000000000000
(2)    0000000000000000
     + 1110001100110100
       1110001100110100
     + 0000000000000000
       1110001100110100
(3)    1100111011010111
     + 0001010001011101
       1111101111001000
     + 1111100011001101
       1111010010010101
(4)    1100111011010111
     + 0001010001011101
       1111101111001000
     + 1111110100100101
       1111100011101101
(5)    0001010001011101
     + 1100111011010111
       1101100100000101
     + 1111111000111011
       1101011101000000
(6)    0001010001011101
     + 1100111011010111
       1101100100000101
     + 1111010111010000
       1100111011010101
(7)    1100111011010111
     + 1100111011010111
       1011011001000010
     + 1111001110110101
       1010100111110111
(8)    1100111011010111
     - 1100111011010111
       0001100010010100
     + 1110101001111101
       0000001100010001

Fig. 21  V3 hand calculation
(1)    0000000000000000
     + 0000000000000000
       0000000000000000
(2)    0000000000000000
     + 0000000000000000
       0000000000000000
(3)    0010110101000001
     + 0000000000000000
       0001011010100000
     + 0000000000000000
       0001011010100000
(4)    0010110101000001
     + 0000000000000000
       0001011010100000
     + 0000010110101000
       0001110001001000
(5)    1101001010111111
     + 0000000000000000
       1110100101011111
     + 0000011100010010
       1111000001110001
(6)    1101001010111111
     + 0000000000000000
       1110100101011111
     + 1111110000011100
       1110010101111011
(7)    0010110101000001
     + 0000000000000000
       0001011010100000
     + 1111100101011110
       0000111111111110
(8)    0000000000000000
     - 0000000000000000
       0000000000000000
     + 0000001111111111
       0000001111111111

Fig. 22  U4 hand calculation
(1)    0000000000000000
     + 0000000000000000
       0000000000000000
(2)    0000000000000000
     + 0001001100111110
       0001001100111110
     + 0000000000000000
       0001001100111110
(3)    0010000011011001
     + 1111001001100101
       0000001011010001
     + 0000010011001111
       0000011110100000
(4)    0010000011011001
     + 1111001001100101
       0000001011010001
     + 0000000111101000
       0000010010111001
(5)    1111001001100101
     + 0010000011011001
       0001101000001011
     + 0000000100101110
       0001101100111001
(6)    1111001001100101
     + 0010000011011001
       0001101000001011
     + 0000011011001110
       0010000011011001
(7)    0010000011011001
     + 0010000011011001
       0011000101000101
     + 0000100000110110
       0011100101111011
(8)    0010000011011001
     - 0010000011011001
       1110111110010011
     + 0000111001011110
       1111110111110001

Fig. 23  V5 hand calculation
(1)    0000000000000000
     + 0000000000000000
       0000000000000000
(2)    0000000000000000
     + 1110111110110000
       1110111110110000
     + 0000000000000000
       1110111110110000
(3)    1111101100111001
     + 1111010001110111
       1111001000010011
     + 1111101111101100
       1110110111111111
(4)    1111101100111001
     + 1111010001110111
       1111001000010011
     + 1111101101111111
       1110110110010010
(5)    1111010001110111
     + 1111101100111001
       1111010101110100
     + 1111101101100100
       1111000011011000
(6)    1111010001110111
     + 1111101100111001
       1111010101110100
     + 1111110000110110
       1111000110101010
(7)    1111101100111001
     + 1111101100111001
       1111100011010101
     + 1111110001101010
       1111010100111111
(8)    1111101100111001
     - 1111101100111001
       0000001001100011
     + 1111110101001111
       1111111110110010

Fig. 24  V7 hand calculation
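
The worksheets in Figs. 19 to 24 appear to apply the same recurrence at every step: the first operand is shifted right one place and added to the second operand (subtracted in the eighth step), and the running total of the previous step is shifted right two places and added to that partial sum. The Python sketch below replays this recurrence on the operand pairs of Fig. 22 (U4). The recurrence is inferred from the rows of the figures rather than quoted from the thesis text, and the names `to_signed`, `hand_calc`, and `U4_PAIRS` are illustrative assumptions.

```python
ZERO = "0" * 16

# Operand pairs (first row, second row) of the eight steps in Fig. 22 (U4).
U4_PAIRS = [
    (ZERO, ZERO),
    (ZERO, ZERO),
    ("0010110101000001", ZERO),
    ("0010110101000001", ZERO),
    ("1101001010111111", ZERO),
    ("1101001010111111", ZERO),
    ("0010110101000001", ZERO),
    (ZERO, ZERO),
]


def to_signed(bits):
    """Interpret a 16-bit string as a two's complement integer."""
    value = int(bits, 2)
    return value - (1 << 16) if value & 0x8000 else value


def hand_calc(pairs):
    """Replay the shift-and-add steps and return the final running total."""
    acc = 0
    last = len(pairs)
    for step, (first, second) in enumerate(pairs, start=1):
        a, b = to_signed(first), to_signed(second)
        # Assumed rule: the last step subtracts, all other steps add.
        partial = (a >> 1) - b if step == last else (a >> 1) + b
        # Python's >> on integers is an arithmetic (sign-preserving) shift.
        acc = partial + (acc >> 2)
    return acc


if __name__ == "__main__":
    result = hand_calc(U4_PAIRS)
    print(format(result & 0xFFFF, "016b"))  # prints 0000001111111111
```

Feeding the operand pairs of another figure into `hand_calc` gives a quick check of that worksheet's last row.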
APPENDIX E.  FORMATION OF 2-BIT ADDER

A.  TWO BIT ADDER TRUTH TABLE

Table XX  Truth table of 2-bit adder

A1  A0  B1  B0  Ci  q1  q0  C0
 0   0   0   0   0   0   0   0
 0   0   0   0   1   0   1   0
 0   0   0   1   0   0   1   0
 0   0   0   1   1   1   0   0
 0   0   1   0   0   1   0   0
 0   0   1   0   1   1   1   0
 0   0   1   1   0   1   1   0
 0   0   1   1   1   0   0   1
 0   1   0   0   0   0   1   0
 0   1   0   0   1   1   0   0
 0   1   0   1   0   1   0   0
 0   1   0   1   1   1   1   0
 0   1   1   0   0   1   1   0
 0   1   1   0   1   0   0   1
 0   1   1   1   0   0   0   1
 0   1   1   1   1   0   1   1
Table XXI  Truth table of 2-bit adder (continued from Table XX)

A1  A0  B1  B0  Ci  q1  q0  C0
 1   0   0   0   0   1   0   0
 1   0   0   0   1   1   1   0
 1   0   0   1   0   1   1   0
 1   0   0   1   1   0   0   1
 1   0   1   0   0   0   0   1
 1   0   1   0   1   0   1   1
 1   0   1   1   0   0   1   1
 1   0   1   1   1   1   0   1
 1   1   0   0   0   1   1   0
 1   1   0   0   1   0   0   1
 1   1   0   1   0   0   0   1
 1   1   0   1   1   0   1   1
 1   1   1   0   0   0   1   1
 1   1   1   0   1   1   0   1
 1   1   1   1   0   1   0   1
 1   1   1   1   1   1   1   1

The two-bit adder has five inputs and three outputs. A1, A0, B1, and B0 represent the
operand inputs and Ci represents the carry in. q1 and q0 represent the sum output and C0
represents the carry out. Once the truth table is set up, it can be reduced with Karnaugh maps.
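
The truth table of the two-bit adder can be generated directly from the addition it describes. The Python sketch below is an illustration with hypothetical function names (`two_bit_adder`, `truth_table`); it treats A1 A0 and B1 B0 as two-bit unsigned numbers, adds the carry in Ci, and splits the three-bit total into q1, q0, and the carry out C0.

```python
from itertools import product


def two_bit_adder(a1, a0, b1, b0, ci):
    """Return (q1, q0, c0) for the two-bit addition A + B + Ci."""
    total = (2 * a1 + a0) + (2 * b1 + b0) + ci   # ranges from 0 to 7
    q0 = total & 1
    q1 = (total >> 1) & 1
    c0 = (total >> 2) & 1
    return q1, q0, c0


def truth_table():
    """List all 32 rows as (A1, A0, B1, B0, Ci, q1, q0, C0)."""
    return [
        inputs + two_bit_adder(*inputs)
        for inputs in product((0, 1), repeat=5)
    ]


if __name__ == "__main__":
    print("A1 A0 B1 B0 Ci q1 q0 C0")
    for row in truth_table():
        print("  ".join(str(bit) for bit in row))
```

The rows come out in ascending binary order of the inputs, with Ci as the fastest-changing column.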
[Figure: Karnaugh maps for the 2-bit adder outputs, with A1A0 and B1B0 as the map axes]

Fig. 25  Karnaugh map reduction

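Karnaugh map reduction gives the following reduced Boolean expressions:

\[
\begin{aligned}
q_1 ={}& \overline{A_1}\,\overline{B_1}A_0B_0 + \overline{A_1}\,\overline{B_1}A_0C_i + \overline{A_1}\,\overline{B_1}B_0C_i + A_1B_1A_0B_0 + A_1B_1A_0C_i + A_1B_1B_0C_i \\
&+ \overline{A_1}B_1\overline{A_0}\,\overline{B_0} + \overline{A_1}B_1\overline{A_0}\,\overline{C_i} + \overline{A_1}B_1\overline{B_0}\,\overline{C_i} + A_1\overline{B_1}\,\overline{A_0}\,\overline{B_0} + A_1\overline{B_1}\,\overline{A_0}\,\overline{C_i} + A_1\overline{B_1}\,\overline{B_0}\,\overline{C_i}
\end{aligned}
\tag{36}
\]

\[
q_0 = \overline{A_0}\,\overline{B_0}C_i + \overline{A_0}B_0\overline{C_i} + A_0\overline{B_0}\,\overline{C_i} + A_0B_0C_i
\tag{37}
\]

\[
C_0 = A_1A_0B_0 + A_0B_1B_0 + A_1A_0C_i + A_1B_0C_i + B_1B_0C_i + A_1B_1 + A_0B_1C_i
\tag{38}
\]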
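
A reduced sum-of-products form can be checked exhaustively against the addition it is meant to implement, since the adder has only 32 input combinations. The Python sketch below does this for q1, q0, and C0. The seven product terms of C0 follow Eq. (38); the terms for q0 and q1 are written out from the usual full-adder relations (q0 = A0 xor B0 xor Ci, and q1 = A1 xor B1 xor c1 with the internal carry c1 = A0B0 + A0Ci + B0Ci) and should be read as an assumption rather than as a transcription of Eqs. (36) and (37). The function names are hypothetical.

```python
from itertools import product


def inv(x):
    """Logical complement of a 0/1 value."""
    return 1 - x


def reduced(a1, a0, b1, b0, ci):
    """Evaluate sum-of-products forms for (q1, q0, c0)."""
    q0 = (inv(a0) & inv(b0) & ci | inv(a0) & b0 & inv(ci)
          | a0 & inv(b0) & inv(ci) | a0 & b0 & ci)
    c0 = (a1 & a0 & b0 | a0 & b1 & b0 | a1 & a0 & ci | a1 & b0 & ci
          | b1 & b0 & ci | a1 & b1 | a0 & b1 & ci)
    q1 = (inv(a1) & inv(b1) & a0 & b0
          | inv(a1) & inv(b1) & a0 & ci
          | inv(a1) & inv(b1) & b0 & ci
          | a1 & b1 & a0 & b0
          | a1 & b1 & a0 & ci
          | a1 & b1 & b0 & ci
          | inv(a1) & b1 & inv(a0) & inv(b0)
          | inv(a1) & b1 & inv(a0) & inv(ci)
          | inv(a1) & b1 & inv(b0) & inv(ci)
          | a1 & inv(b1) & inv(a0) & inv(b0)
          | a1 & inv(b1) & inv(a0) & inv(ci)
          | a1 & inv(b1) & inv(b0) & inv(ci))
    return q1, q0, c0


def direct(a1, a0, b1, b0, ci):
    """Reference result obtained by plain integer addition."""
    total = (2 * a1 + a0) + (2 * b1 + b0) + ci
    return (total >> 1) & 1, total & 1, (total >> 2) & 1


def mismatches():
    """Return every input combination where the two forms disagree."""
    return [bits for bits in product((0, 1), repeat=5)
            if reduced(*bits) != direct(*bits)]


if __name__ == "__main__":
    print(mismatches())  # prints []
```

An empty list means the sum-of-products forms agree with the arithmetic definition on all 32 rows of the truth table.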